Why LLMs still have problems with OCR
Document ingestion and the launch of Gemini 2.0 caused a lot of buzz this week. As a team building in this space, we have researched this thoroughly. Here’s our take: ingestion is a multistep pipeline, and maintaining confidence in nondeterministic LLM outputs across millions of pages is a problem.
![Why LLMs still have problems with OCR](https://cdn.prod.website-files.com/6707c5683ddae1a50202bac6/67a521a33f69fd548661a0c1_V2.png)
Comments URL: https://news.ycombinator.com/item?id=42966958
Points: 94
# Comments: 72