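Step 1: Split the article into sentences at each full stop (。) and number them.
Step 2: Loop over the sentences with jieba and segment each one into words.
Step 3: Import the resulting txt file into Excel.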
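Step 1 could be sketched roughly as follows. This is a minimal illustration rather than the code from the referenced article: the function name `split_and_number` and the sample text are made up for the example, and it assumes the article is already loaded into a string and that sentences end with the Chinese full stop "。".

```python
def split_and_number(text, stop="。"):
    """Split text at each full stop and return (number, sentence) pairs.

    The full stop itself is dropped from each sentence.
    """
    # Strip whitespace and discard empty pieces, such as the one
    # that follows the final full stop.
    pieces = [piece.strip() for piece in text.split(stop)]
    sentences = [piece for piece in pieces if piece]
    # Number the sentences starting from 1.
    return list(enumerate(sentences, start=1))


if __name__ == "__main__":
    sample = "今天天气很好。我们去公园散步。"  # hypothetical sample text
    for number, sentence in split_and_number(sample):
        print(number, sentence)
```

Each numbered sentence can then be written to its own line of a txt file, which is the input that Step 2 reads line by line.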
------------------------------------------------------------------
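For reference, my own earlier articles:
Article 1: python (adding a sequence number to the start of each line) & (adding a sequence number to the end of each line)
<https://blog.csdn.net/qq_19741181/article/details/79824433>
Article 2: python [jieba]: how to insert line breaks (while segmenting)
<https://blog.csdn.net/qq_19741181/article/details/79823964>
Related: python jieba: opening the result as txt after segmentation
<https://blog.csdn.net/qq_19741181/article/details/79823489>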
------------------------------------------------------------------
Some steps are omitted here; see Article 1 for them.
Code:
>>> import jieba
>>> with open('E:/000.txt', 'r') as f:
...     for line in f:
...         seg = jieba.cut(line.strip(), cut_all=False)
...         output = '/'.join(seg)
...         output = output + '\n'
...         with open('E:/jieba123.txt', 'a+') as s:
...             s.write(output)
...
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\oil\AppData\Local\Temp\jieba.cache
Loading model cost 0.913 seconds.
Prefix dict has been built succesfully.
7
1
70
1
62
128
59
57
76
87
174
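The same loop can also be saved as a script instead of being typed into the interpreter. The sketch below is one possible version, not the original code. The function name `segment_file` and its `cut` and `encoding` parameters are introduced here for illustration, and UTF-8 input is an assumption (the session relies on the platform default encoding). Unlike the session, it opens the output file once in 'w' mode, so running it again overwrites the file rather than appending to it, and it skips blank lines. The numbers printed at the end of the session are the return values of `s.write()` (characters written per line) echoed by the interpreter; a script does not print them.

```python
def segment_file(src_path, dst_path, cut=None, encoding="utf-8"):
    """Segment src_path line by line and write '/'-joined words to dst_path.

    Returns the number of lines written. The encoding default is an
    assumption; pass encoding="gbk" if the input was saved that way.
    """
    if cut is None:
        # Imported lazily so another tokenizer can be passed in instead.
        import jieba  # third-party package: pip install jieba
        cut = jieba.cut  # accurate mode (cut_all defaults to False)

    written = 0
    with open(src_path, "r", encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            line = line.strip()
            if not line:
                continue  # skip blank lines instead of writing empty rows
            dst.write("/".join(cut(line)) + "\n")
            written += 1
    return written


# Usage with the paths from the session:
# segment_file("E:/000.txt", "E:/jieba123.txt")
```

Because each output line uses "/" as the separator, that character can be given as a custom delimiter when importing the txt file into Excel in Step 3.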