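blackantt posted on 2024-4-15 07:42:12

Can ChatGPT or some other method add punctuation to auto-generated subtitles? (Not whisper or anything else that uses a local GPU...

Last edited by blackantt on 2024-4-15 09:42

Auto-generated subtitles usually have no punctuation, and a single complete sentence often ends up split across two or more subtitle cues. Is there a way to add correct punctuation to them? (Not whisper or any other method that relies on a local GPU.)
For example: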


1
00:00:00,000 --> 00:00:04,000
Guess how much I love you

2
00:00:04,000 --> 00:00:11,000
One autumn morning little nut-brown hair big nut-brown hair and little field mouse

3
00:00:11,000 --> 00:00:18,000
sat bathed in the early morning light fascinated by one of Little White Owl's stories

4
00:00:18,500 --> 00:00:19,000
what a nice day!

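If that is hard to do, let me simplify the problem. How can text that has already been punctuated be mapped back into the original, unpunctuated srt file? See post #3 for the details.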

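goodzmq posted on 2024-4-15 08:23:01

I'll take a look!

blackantt posted on 2024-4-15 09:35:04

Last edited by blackantt on 2024-4-15 09:37

goodzmq posted on 2024-4-15 08:23
I'll take a look!

Let me simplify the problem. How can text that has already been punctuated be mapped back into the original, unpunctuated srt file?
For example, the original file 1.srt looks like this: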
1
00:00:00,000 --> 00:00:04,000
Guess how much I love you

2
00:00:04,000 --> 00:00:11,000
One autumn morning little nut-brown hair big nut-brown hair and little field mouse

3
00:00:11,000 --> 00:00:18,000
sat bathed in the early morning light fascinated by one of Little White Owl's stories

4
00:00:18,500 --> 00:00:19,000
what a nice day!

First I turn it into a file with no index lines and no timestamp lines, like this:
Guess how much I love you
One autumn morning little nut-brown hair big nut-brown hair and little field mouse
sat bathed in the early morning light fascinated by one of Little White Owl's stories
what a nice day

Then I add punctuation to the text, like this:
Guess how much I love you.
One autumn morning, little nut-brown hair, big nut-brown hair and little field mouse
sat bathed in the early morning light, fascinated by one of Little White Owl's stories.
what a nice day!

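Now for the last step: how do I write the punctuated text (commas, periods, question marks, exclamation marks and so on) back onto the original timeline? Is this a find, locate and replace problem? Is there an existing module for it? If not, could someone implement it?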
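
One possible way to do this last step, as a minimal sketch rather than a ready-made module: walk through the srt line by line, leave index and timestamp lines untouched, and for every text line take the same number of words from the punctuated text. The function name `restore_punctuation` is made up for this example. It assumes the punctuation step only attaches marks to existing words (no words added, removed or reordered) and that no subtitle line consists of digits only.

```python
import re

def _norm(word):
    # Compare words ignoring case and punctuation.
    return re.sub(r"[^\w]", "", word).lower()

def restore_punctuation(srt_text, punctuated_text):
    """Copy punctuated words back onto the text lines of an SRT file."""
    # Line breaks in the punctuated text are ignored; only word order matters.
    tokens = punctuated_text.split()
    pos = 0
    out = []
    for line in srt_text.splitlines():
        stripped = line.strip()
        # Assumption: blank, digits-only and "-->" lines are not subtitle text.
        is_text = bool(stripped) and not stripped.isdigit() and "-->" not in stripped
        if not is_text:
            out.append(line)
            continue
        words = stripped.split()
        chunk = tokens[pos:pos + len(words)]
        # Refuse to continue if the two texts have drifted apart.
        if [_norm(w) for w in chunk] != [_norm(w) for w in words]:
            raise ValueError(f"Punctuated text does not match at: {stripped!r}")
        out.append(" ".join(chunk))
        pos += len(words)
    if pos != len(tokens):
        raise ValueError("Punctuated text contains extra words")
    return "\n".join(out) + "\n"

srt = """1
00:00:00,000 --> 00:00:04,000
Guess how much I love you

2
00:00:04,000 --> 00:00:11,000
One autumn morning little nut-brown hair big nut-brown hair and little field mouse

3
00:00:11,000 --> 00:00:18,000
sat bathed in the early morning light fascinated by one of Little White Owl's stories
"""

punctuated = """Guess how much I love you.
One autumn morning, little nut-brown hair, big nut-brown hair and little field mouse
sat bathed in the early morning light, fascinated by one of Little White Owl's stories.
"""

result = restore_punctuation(srt, punctuated)
print(result)
```

Because the words are matched by count and order rather than by line, this should also cope with a sentence that spans several cues, or with a punctuation tool that re-wraps the lines. If the tool changes any word, the sketch raises an error instead of guessing.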

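不二如是 posted on 2024-4-18 19:41:44

Just tell it exactly what you need.

blackantt posted on 2024-4-20 23:30:28

不二如是 posted on 2024-4-18 19:41
Just tell it exactly what you need.

How do I get it to ignore the index lines and the timestamp lines? It can add punctuation to plain text, but it cannot do it on the srt directly.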
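
One way around this, sketched here under assumptions, is to not show the index and timestamp lines to the punctuation tool at all: parse the srt first and hand over only the cue text. The helper names `parse_srt` and `text_only` are made up for this example, and the sketch assumes a well-formed file in which each cue is an index line, a timestamp line, then one or more text lines.

```python
import re

# Timestamp line such as "00:00:04,000 --> 00:00:11,000"
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")

def parse_srt(srt_text):
    """Split SRT content into (index, timing, text) cues."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if len(lines) >= 3 and TIMING.match(lines[1]):
            # Join multi-line cue text so every cue becomes one line.
            cues.append((lines[0], lines[1], " ".join(lines[2:])))
    return cues

def text_only(cues):
    """Return one text line per cue, without index or timing lines."""
    return "\n".join(text for _, _, text in cues)

sample = """1
00:00:00,000 --> 00:00:04,000
Guess how much I love you

2
00:00:04,000 --> 00:00:11,000
One autumn morning little nut-brown hair big nut-brown hair and little field mouse
"""

cues = parse_srt(sample)
plain = text_only(cues)
print(plain)
```

The `plain` string is what would be pasted into ChatGPT or any other punctuation tool. The index and timing of each cue stay in `cues`, so the punctuated lines can be paired with them again afterwards.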