Monday, 9 April 2007

Announcing the OCRopus Open Source OCR System

Posted on 08:12
Posted by Thomas Breuel, OCRopus Project Leader

We're happy to announce the OCRopus OCR Project, a Google-sponsored project to develop advanced OCR technologies in the IUPR research group, headed by Prof. Thomas Breuel at the DFKI (German Research Center for Artificial Intelligence, Kaiserslautern, Germany).

The goal of the project is to advance the state of the art in optical character recognition and related technologies, and to deliver a high-quality OCR system suitable for document conversion, electronic libraries, vision-impaired users, historical document analysis, and general desktop use. In addition, we are structuring the system so that other researchers in the field can easily reuse it.

The OCRopus engine is based on two research projects: a high-performance handwriting recognizer developed in the mid-1990s and deployed by the US Census Bureau, and novel high-performance layout analysis methods.

The project is expected to run for three years and support three Ph.D. students or postdocs. We are announcing a technology preview release of the software under the Apache license. The preview is English-only and combines the Tesseract character recognizer with IUPR layout analysis and language modeling tools; additional recognizers and functionality will follow in future releases.

The IUPR research group has extensive experience in OCR and related technologies, and will base the work on previous research and existing software in the area. Existing software components include high-performance handwriting recognition software that has received top evaluations by NIST and was deployed by the US Census Bureau, the recently open-sourced Tesseract OCR system, a separate Google project for probabilistic natural language modeling, and software for layout analysis and character recognition. The IUPR research group gratefully acknowledges funding by the German BMBF, the state of Rhineland-Palatinate, and other public and private partners (please see www.iupr.org for more details).

We hope for contributions from the open source community in areas such as adapting the system to additional languages, creating a GNOME desktop application, integrating with GNOME desktop search, building web-based tools for proofing and training, language modeling, additional character recognition engines, and other useful tools and add-ons.

The project web page can be found at ocropus.org.
Posted in ocr, open source