[Solved] Lesson 35, Exercise 5: counting the number of files and lines for each type of code

咖啡的旅游记 · Posted 2018-5-17 00:24:59

The archive contains the answer and my own code.
My code and the answer report different line counts. Can anyone help?

import easygui as g
import os
import os.path as op
import sys

# Collect the full paths of all files whose extension is in file_type_list
def acquire_path(path_dft):
    os.chdir(path_dft)
    for each_file in os.listdir(os.curdir):
        (file_name, file_type) = op.splitext(each_file)
        if file_type in file_type_list:
            path_list.extend([os.getcwd() + os.sep + each_file])
        if op.isdir(each_file):
            acquire_path(each_file)
            os.chdir(os.pardir)

# Count the lines in each file and pair each count with the file name
def count_file_lines(path_list):
    file_line_num = []
    for each_file in path_list:
        line_count = 0
        try:
            with open(each_file, encoding="UTF-8") as f:
                for i in f:
                    line_count += 1
        except UnicodeDecodeError:
            pass  # Files in an incompatible encoding are unavoidable; ignore them here
        file_line_num.extend([[op.basename(each_file), line_count]])
    return file_line_num

# Build a dict mapping each file type (key) to its file count and line count (value)
def Save_result_dict(file_list):
    show_list_textbox = {}
    # filetype_num = 0
    # show_file_count = 0
    for each_file in file_list:
        if op.splitext(each_file[0])[1] in show_list_textbox.keys():
            # filetype_num=show_list_textbox[op.splitext(each_file[0])[1]][0][0]+1
            # show_file_count=show_list_textbox[op.splitext(each_file[0])[1]][0][1]+int(each_file[1])
            show_list_textbox[op.splitext(each_file[0])[1]] = ([show_list_textbox[op.splitext(each_file[0])[1]][0][0] + 1, show_list_textbox[op.splitext(each_file[0])[1]][0][1] + int(each_file[1])],)
        else:
            show_list_textbox[op.splitext(each_file[0])[1]] = ([1, int(each_file[1])],)
    return show_list_textbox

# Show each file type with its file count and line count in a textbox
def login_main_load():
    str_list = ''
    sum_daima_copunt = 0

    acquire_path(g.diropenbox(msg="Choose a folder", title="Please", default=None))
    nop = Save_result_dict(count_file_lines(path_list))

    for each_jg in nop.keys():
        str_list += '[%s] source files: %s, lines of code: %s' % (each_jg, nop[each_jg][0][0], nop[each_jg][0][1]) + '\n'
        sum_daima_copunt += int(nop[each_jg][0][1])
    msg = "You have written %s lines of code so far, " % sum_daima_copunt + "progress: %0.2f" % (sum_daima_copunt / 1000) + '%' + '\n' + '%s lines to go before reaching 100,000. Keep going!' % (100000 - sum_daima_copunt)

    g.textbox(msg=msg, title="Learning progress", text=str_list)

path_list = []
file_type_list = ['.py', '.txt', '.htm']

try:
    login_main_load()
except:
    sys.exit("Journey terminated")

# import easygui as g
# import os
#
#
# def show_result(start_dir):
#    lines = 0
#    total = 0
#    text = ""
#
#    for i in source_list:
#       lines = source_list[i]
#       total += lines
#       text += "[%s] source files: %d, lines of code: %d\n" % (i, file_list[i], lines)
#    title = 'Statistics'
#    msg = 'You have written %d lines of code so far, progress: %.2f %%\n%d lines to go before reaching 100,000. Keep going!' % (total, total / 1000, 100000 - total)
#    g.textbox(msg, title, text)
#
#
# def calc_code(file_name):
#    lines = 0
#    with open(file_name) as f:
#       print('Analyzing file: %s ...' % file_name)
#       try:
#          for each_line in f:
#                lines += 1
#       except UnicodeDecodeError:
#          pass  # Files in an incompatible encoding are unavoidable; ignore them here
#    return lines
#
#
# def search_file(start_dir):
#    os.chdir(start_dir)
#
#    for each_file in os.listdir(os.curdir):
#       ext = os.path.splitext(each_file)[1]
#       if ext in target:
#          lines = calc_code(each_file)  # count the lines
#          # Remember how exceptions work? If the key is not in the dict, a KeyError is raised and we add the key
#          # Count the files
#          try:
#                file_list[ext] += 1
#          except KeyError:
#                file_list[ext] = 1
#          # Count the lines of source code
#          try:
#                source_list[ext] += lines
#          except KeyError:
#                source_list[ext] = lines
#
#       if os.path.isdir(each_file):
#          search_file(each_file)  # recursive call
#          os.chdir(os.pardir)  # after the recursive call, be sure to return to the parent directory
#
#
# target = ['.c', '.cpp', '.py', '.cc', '.java', '.pas', '.asm','.txt','.htm']
# file_list = {}
# source_list = {}
#
# g.msgbox("Please open the folder where you keep all your code...", "Code statistics")
# path = g.diropenbox("Please choose your code folder:")
#
# search_file(path)
# show_result(path)
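
Both scripts above accumulate a file count and a line total per extension, one with nested tuples and the other with try/except KeyError. As a hedged alternative, the same tally can be sketched with collections.defaultdict. The function name tally_by_extension is hypothetical and does not appear in either script; the input is assumed to be (file name, line count) pairs like those returned by count_file_lines.

```python
from collections import defaultdict
import os.path


def tally_by_extension(file_line_counts):
    """Return {extension: [file_count, line_total]} for (name, lines) pairs."""
    # Each new extension starts at zero files and zero lines.
    totals = defaultdict(lambda: [0, 0])
    for file_name, line_count in file_line_counts:
        ext = os.path.splitext(file_name)[1]
        totals[ext][0] += 1           # one more file of this type
        totals[ext][1] += line_count  # add its lines to the running total
    return dict(totals)
```

Reading a result is then totals['.py'][0] for the file count and totals['.py'][1] for the line total, without the extra tuple layer.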

Best answer

thexiosi · Posted 2018-5-17 11:30:09

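Hi, I read through the code carefully. The cause is this line: ' with open(each_file,encoding="UTF-8") as f: '

If you open the file the default way, i.e. ' with open(each_file) as f:', the non-.py files in your list are handled correctly, but a .py file triggers UnicodeDecodeError, so the .py files cannot be counted.

If you open it with encoding="utf-8", .py files and .txt files without Chinese characters are handled fine, but a .txt file that contains Chinese characters gets a line count of 0, which is the behavior you are seeing.

Sorry, my knowledge is limited and I have not yet found a way to handle this. You may want to wait for a reply from one of the experts.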

def count_file_lines(path_list):
    file_line_num = []
    for each_file in path_list:
        line_count = 0
        try:
            with open(each_file, encoding="UTF-8") as f:  # here
                for i in f:
                    line_count += 1
        except UnicodeDecodeError:
            pass  # Files in an incompatible encoding are unavoidable; ignore them here
        file_line_num.extend([[op.basename(each_file), line_count]])
    return file_line_num
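
One possible way around the encoding problem described above, offered as a sketch rather than a confirmed fix for this thread: try several encodings in turn, and if none of them decodes the file, count lines on the raw bytes. The function name count_lines and the fallback order ("utf-8", "gbk") are assumptions, not part of the original code. The order matters, because GBK accepts many byte sequences that are not valid UTF-8, so UTF-8 is tried first.

```python
def count_lines(path, encodings=("utf-8", "gbk")):
    """Count the lines in a file, trying each candidate encoding in turn."""
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as f:
                return sum(1 for _ in f)
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    # No candidate encoding worked: split the raw bytes on b"\n" instead,
    # which needs no decoding at all.
    with open(path, "rb") as f:
        return sum(1 for _ in f)
```

Because each attempt returns only after the whole file has been read, a file that fails halfway through is recounted from the start with the next encoding instead of keeping a partial count.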

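咖啡的旅游记 · Posted 2018-5-17 11:59:02

thexiosi wrote on 2018-5-17 11:30
Hi, I read through the code carefully. The cause is this line: ' with open(each_file,encoding="UTF-8") as f: '

If you open the file the default way ...

Yes, this problem really does exist. 小甲鱼's code also mentions a similar issue, and I have not found a good solution either.

咖啡的旅游记 · Posted 2018-5-18 22:44:15

咖啡的旅游记 wrote on 2018-5-17 11:59
Yes, this problem really does exist. 小甲鱼's code also mentions a similar issue, and I have not found a good solution either.

Running the answer's code on my source files gives 70 lines, while my own code gives 1500 lines. I don't know where the problem is.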
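
One way to narrow down a gap like this is to list the line count of every file separately, so the two scripts can be compared file by file instead of by grand total. The sketch below is only an illustration: the function name per_file_line_counts is hypothetical, it walks the tree with os.walk rather than the os.chdir recursion used in the thread, and it reads in binary mode so that no file is skipped or cut short by a decoding error.

```python
import os


def per_file_line_counts(root, extensions=(".py", ".txt", ".htm")):
    """Return {path relative to root: line count} for files with a matching extension."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1] in extensions:
                full_path = os.path.join(dirpath, name)
                # Binary mode: lines are split on b"\n", no decoding involved.
                with open(full_path, "rb") as f:
                    counts[os.path.relpath(full_path, root)] = sum(1 for _ in f)
    return counts
```

Printing the sorted items of the returned dict gives a per-file report. Any file whose count differs from what one of the scripts reports is a candidate for the encoding issue discussed earlier in the thread.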
