How can you use an AI tool to extract all of the text from an image-based (scanned) PDF file?
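Step 1: Convert the PDF file to images
For the detailed method, see the article "Zero-Code Programming: Using kimichat to Automatically Batch-Split an Image-Based PDF into Multiple Images" (《零代码编程:用kimichat将图片版PDF自动批量分割成多个图片》).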
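The referenced article's exact method is not reproduced here. As a rough illustration only, the sketch below shows one common way to do this step, assuming the third-party PyMuPDF library (imported as `fitz`) is installed. The function names `pdf_to_images` and `page_image_name`, the 200 dpi setting, and the `page_N.png` naming (chosen to match the file names that appear later in this article) are all assumptions.

```python
from pathlib import Path


def page_image_name(page_number: int) -> str:
    """Return the output file name for a 1-based page number, e.g. page_5.png."""
    return f"page_{page_number}.png"


def pdf_to_images(pdf_path: str, out_dir: str, dpi: int = 200) -> list[Path]:
    """Render every page of the PDF to a PNG file and return the paths written."""
    # Assumption: PyMuPDF is installed. It is imported lazily so that the
    # naming helper above can be used without it.
    import fitz

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with fitz.open(pdf_path) as doc:
        for index, page in enumerate(doc, start=1):
            pix = page.get_pixmap(dpi=dpi)      # rasterize the page
            target = out / page_image_name(index)
            pix.save(str(target))               # format inferred from the .png suffix
            written.append(target)
    return written
```

A call such as `pdf_to_images("book.pdf", "pages")` (both paths hypothetical) would write `pages/page_1.png`, `pages/page_2.png`, and so on.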
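Step 2: Recognize the text in the images
Upload the images produced from the PDF in Step 1 to kimichat.
Note: kimichat currently accepts at most 50 images per upload, and each image must not exceed 100 MB.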
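If the PDF has more than 50 pages, the images have to be uploaded in several rounds. The sketch below is one possible way to plan those rounds locally before uploading. The helper names `plan_uploads` and `scan_folder` are hypothetical, and reading "100 MB" as 100 MiB is an assumption.

```python
from pathlib import Path

MAX_IMAGES_PER_UPLOAD = 50            # limit stated in the note above
MAX_IMAGE_BYTES = 100 * 1024 * 1024   # "100 MB", assumed here to mean 100 MiB


def plan_uploads(sized_files, batch_size=MAX_IMAGES_PER_UPLOAD, max_bytes=MAX_IMAGE_BYTES):
    """Split (name, size_in_bytes) pairs into upload batches.

    Returns (batches, oversized): batches is a list of lists of names,
    oversized holds the names that exceed the per-image size limit.
    """
    sized_files = list(sized_files)  # allow generators, since we iterate twice
    accepted = [name for name, size in sized_files if size <= max_bytes]
    oversized = [name for name, size in sized_files if size > max_bytes]
    batches = [accepted[i:i + batch_size] for i in range(0, len(accepted), batch_size)]
    return batches, oversized


def scan_folder(folder):
    """Collect (name, size) pairs for every PNG file in a folder."""
    return [(p.name, p.stat().st_size) for p in sorted(Path(folder).glob("*.png"))]
```

For example, `plan_uploads(scan_folder("pages"))` (folder name hypothetical) returns the groups of file names to upload one round at a time, plus any files too large to upload at all.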
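Once the upload finishes, kimichat parses the images.
Some images will show the message "No text extracted" or "Parsing failed".
Click the red X in the upper-right corner of each failed image to remove the images that cannot be parsed.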
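Then press Enter, and the text in all remaining images is recognized. However, the results are not ordered by file name and come out somewhat jumbled, so you can ask kimichat to reorder them:
Please arrange the results in the order of the image titles.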
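The ordering issue can also be checked locally. File names such as `page_10.png` sort before `page_5.png` when compared as plain strings, so ordering by "title" should use the page number itself. A minimal sketch, assuming file names follow the `page_N.png` pattern seen in this article (the helper name `page_number` is hypothetical):

```python
import re


def page_number(file_name: str) -> int:
    """Extract the first run of digits in a name such as 'page_12.png'."""
    match = re.search(r"\d+", file_name)
    if match is None:
        raise ValueError(f"no page number in {file_name!r}")
    return int(match.group())


names = ["page_10.png", "page_5.png", "page_12.png", "page_6.png"]
print(sorted(names))                   # plain string sort puts page_10 before page_5
print(sorted(names, key=page_number))  # numeric sort: 5, 6, 10, 12
```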
Kimichat's final output:
Certainly. Following the order of the image titles you provided, here is the organized text:
- **page_5.png:**
- I'll Be α
- COWBOY
- Anita Bijsterbosch
- Clavis
- NEWYORK
- **page_6.png:**
- "Mommy, there's a party at school
- tomorrow, and we get to wear
- costumes!" Sammy says.
- **page_8.png:**
- "I think I'll be
- a cowboy.
- lasso"
- **page_9.png:**
- cowboyhat
- Shirt .........
- vest
- ....
- Fringed pants
- cowboy boots
- spurs
- **page_10.png:**
- "Yee-Haw!"
- **page_12.png:**
- "Or wait, I'll go as
- a firefighter."
- dragon
- walkie-talkie
- Fire engine
- Flashlight
- **page_13.png:**
- helmet
- Firefighter suit
- gloves
- Firefighter boots
- **page_15.png:**
- "Fire!"
- **page_16.png:**
- "Hmm, perhaps
- I'll be a pirate."
- parrot
- JollyRoger
- treasure map
- treasure chest
- spyglass
- **page_17.png:**
- bandana
- eyepatch
- striped T-shirt
- ......
- belt
- trousers
- buccaneer
- boots
- **page_19.png:**
- Land
- ho!"
- **page_20.png:**
- "Or I could
- gc
- as a knight.
- castle
- horse
- cuddly toy
- Feather
- dragon
- Sword"
- **page_21.png:**
- helmet
- armor
- shield
- gauntlet
- boots
- **page_23.png:**
- Tan-tan-tara!
- **page_24.png:**
- "I could be
- a magician...
- wand"
- **page_25.png:**
- magic hat
- OD
- boots
- **page_26.png:**
- "+"
- **page_27.png:**
- Abracadabra!
- **page_28.png:**
- "Or maybe
- I should
- just go
- as myself?"
- **page_29.png:**
- T-shirt
- overalls
- boots
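This text appears to have been extracted from a children's book about occupations and role play. If you need any further information or help, feel free to contact me.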
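Because some images were removed after failing to parse, it can be useful to check which pages are absent from the final output. The sketch below is one possible way to do that, assuming the reply has been saved as text in the format shown above, with headers of the form `- **page_N.png:**`. The names `parse_pages` and `missing_pages` are hypothetical.

```python
import re

# Matches a header line such as "- **page_5.png:**" and captures the page number.
HEADER = re.compile(r"^-\s*\*\*page_(\d+)\.png:\*\*\s*$")


def parse_pages(reply: str) -> dict[int, list[str]]:
    """Group the reply's bullet items under the page number of the preceding header."""
    pages: dict[int, list[str]] = {}
    current = None
    pending = False  # True right after a lone "-" whose text is on the next line
    for raw in reply.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = int(header.group(1))
            pages[current] = []
            pending = False
        elif current is None:
            continue                      # text before the first header
        elif line == "-":
            pending = True
        elif line.startswith("- "):
            pages[current].append(line[2:])
            pending = False
        elif pending and line:
            pages[current].append(line)   # item text that followed a lone "-"
            pending = False
    return pages


def missing_pages(pages: dict[int, list[str]]) -> list[int]:
    """Page numbers between the first and last recognized page that have no entry."""
    if not pages:
        return []
    return [n for n in range(min(pages), max(pages) + 1) if n not in pages]
```

Applied to the output above, `missing_pages` would report pages 7, 11, 14, 18, and 22. These may be images that failed to parse or pages that contain no text. Pages before the first recognized one are not checked by this sketch.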