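Step 1: Split the article into sentences at each full stop (。) and number them.
Step 2: Loop over the sentences and segment each one with jieba.
Step 3: Import the resulting txt file into Excel.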
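Step 1 can be sketched roughly as follows. This is a minimal illustration, not the code from the referenced posts: the function name `split_and_number` and the sample text are made up for the example, and it assumes the Chinese full stop "。" is the only sentence delimiter.

```python
def split_and_number(text, delimiter="。"):
    """Split text on the delimiter and return (number, sentence) pairs."""
    # Drop empty pieces, such as the one left after the final full stop.
    sentences = [s.strip() for s in text.split(delimiter) if s.strip()]
    # Number the sentences starting from 1.
    return list(enumerate(sentences, start=1))


# Hypothetical sample text, used only for illustration.
sample = "今天天气很好。我们去公园散步。晚上回家吃饭。"
numbered = split_and_number(sample)
for number, sentence in numbered:
    print(f"{number}\t{sentence}")
```

Each printed line holds the number, a tab, and the sentence, which is a layout that is easy to feed into the later steps.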
------------------------------------------------------------------
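See my earlier posts:
Post 1: python (add a sequence number at the start of each line) & (add a sequence number at the end of each line)
<https://blog.csdn.net/qq_19741181/article/details/79824433>
Post 2: python [jieba]: how to insert line breaks (while segmenting)
<https://blog.csdn.net/qq_19741181/article/details/79823964>
Post 3: python jieba: opening the output as txt after segmentation
<https://blog.csdn.net/qq_19741181/article/details/79823489>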
------------------------------------------------------------------
Some steps are omitted here; see Post 1 for details.
Code:
>>> import jieba
>>> with open('E:/000.txt', 'r') as f:
...     for line in f:
...         seg = jieba.cut(line.strip(), cut_all=False)
...         output = '/'.join(seg)
...         output = output + '\n'
...         with open('E:/jieba123.txt', 'a+') as s:
...             s.write(output)
...
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\oil\AppData\Local\Temp\jieba.cache
Loading model cost 0.913 seconds.
Prefix dict has been built succesfully.
7
1
70
1
62
128
59
57
76
87
174
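For step 3, one option is to write the numbered, segmented sentences as tab-separated text, which Excel can split into columns on import. The sketch below is only an illustration under several assumptions: `write_segmented_tsv` and the demo file name are hypothetical, and the tokenizer is passed in as the `cut` argument so the example runs without jieba installed (whitespace splitting stands in for it here). To segment Chinese text, `jieba.cut` could be passed as `cut` instead.

```python
import csv
import os
import tempfile


def write_segmented_tsv(sentences, path, cut=str.split):
    """Write one row per sentence: its number, then the slash-joined tokens."""
    # utf-8-sig writes a BOM, which usually helps Excel recognise UTF-8 text.
    with open(path, "w", encoding="utf-8-sig", newline="") as out:
        writer = csv.writer(out, delimiter="\t")
        for number, sentence in enumerate(sentences, start=1):
            # Skip whitespace-only tokens so the joined output stays clean.
            tokens = [t for t in cut(sentence) if t.strip()]
            writer.writerow([number, "/".join(tokens)])


# Demo with whitespace splitting as a stand-in tokenizer (an assumption).
demo_path = os.path.join(tempfile.gettempdir(), "segmented_demo.txt")
write_segmented_tsv(["we go home", "it rains"], demo_path)
with open(demo_path, encoding="utf-8-sig") as f:
    demo_rows = f.read().splitlines()
print(demo_rows)
```

Unlike the session above, this opens the output file once rather than reopening it in append mode for every line, so rerunning it overwrites the file instead of adding duplicate rows.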