python - Delete every non-UTF-8 symbol from a string -



I have a large number of files and a parser. I have to strip non-UTF-8 symbols and put the information into MongoDB. Currently I have code like this:

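    with open(fname, "r") as fp:
        for line in fp:
            line = line.strip()
            # Python 2: line is a byte string, so decode it first,
            # dropping any bytes that are not valid UTF-8
            line = line.decode('utf-8', 'ignore')
            # then encode back to UTF-8 bytes
            line = line.encode('utf-8', 'ignore')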

Somehow I still get an error:

    bson.errors.InvalidStringData: strings in documents must be valid UTF-8: 1/b62010montecassianomcir\xe2\x86\x90ta0\xe2\x86\x90008923304320733/290066010401040101506055soccorin

I don't get it. Is there a simple way to do it?

UPD: It seems Python and Mongo don't agree on the definition of a valid UTF-8 string.
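One way to check what Python itself considers invalid is to decode strictly and see where it fails. The following is only a minimal sketch, assuming Python 3 and raw bytes as input; `first_invalid_utf8` is a hypothetical helper name, not part of PyMongo or the standard library. Note that the `\xe2\x86\x90` bytes visible in the error message are a well-formed UTF-8 encoding of U+2190, so Python's decoder accepts them.

```python
def first_invalid_utf8(raw: bytes):
    """Return (offset, offending bytes) for the first invalid UTF-8
    sequence in raw, or None if raw decodes cleanly.

    Hypothetical helper for diagnosis only.
    """
    try:
        # Strict decoding (the default) raises at the first bad sequence.
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start and exc.end delimit the bytes that could not be decoded.
        return exc.start, raw[exc.start:exc.end]
    return None


if __name__ == "__main__":
    print(first_invalid_utf8(b"abc\xffdef"))
    print(first_invalid_utf8(b"mcir\xe2\x86\x90ta0"))
```

If this returns None for a line that MongoDB still rejects, the disagreement is probably not about malformed bytes, and the line is worth inspecting more closely before it is inserted.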

Try the line of code below instead of the last two lines. Hope it helps:

    line = line.decode('utf-8', 'ignore').encode('utf-8')
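The line above relies on Python 2, where `str` has a `decode` method. A rough equivalent for Python 3 might look like the sketch below, assuming the file is opened in binary mode so that the decoding step stays explicit. The names `clean_line` and `read_clean_lines` are made up for illustration.

```python
def clean_line(raw: bytes) -> str:
    """Drop bytes that are not valid UTF-8 and return stripped text."""
    # errors="ignore" silently skips any byte sequence that cannot be decoded.
    return raw.decode("utf-8", "ignore").strip()


def read_clean_lines(fname):
    """Yield cleaned lines from a file (hypothetical helper)."""
    # Binary mode gives raw bytes, so no implicit decoding can fail on read.
    with open(fname, "rb") as fp:
        for raw in fp:
            yield clean_line(raw)


if __name__ == "__main__":
    print(clean_line(b"abc\xff\xfedef\n"))
```

The result is a `str`, which is what a Python 3 driver would expect for text fields, so there is no need to encode it back to bytes before storing it.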

python mongodb encode
