Integrating a Third-Party OCR Library into OpenHarmony for Text Extraction

Author: Guo Yuefeng (郭岳峰), OS framework development engineer, Shenzhen Kaihong Digital Industry Development Co., Ltd.
The following content is shared by a guest author and does not represent the views of the OpenAtom Foundation.

1. Introduction
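Tesseract (Apache 2.0 license) is a cross-platform C++ library for optical character recognition (OCR) on images. This sample adapts the Tesseract library so that it runs on OpenAtom OpenHarmony (hereafter "OpenHarmony") and adds N-API interfaces so that upper-layer applications can call the features Tesseract provides.

2. Demo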
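The sample demonstrates three features: text recognition, identity information recognition, and extraction of text to a local file.

The code has been uploaded to the SIG repository at the following link: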
https://gitee.com/openharmony-sig/knowledge_demo_temp/tree/master/fa/ocrdemo
3. Directory structure
4. Call flow
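The call flow involves three layers. The application layer implements what the sample shows, including the page layout and the business logic. The middle layer acts as a bridge: it exposes N-API interfaces to the application and forwards the calls to the third-party library. The native layer uses the third-party library Tesseract to provide the actual functionality.

5. Source code analysis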
The analysis covers two parts: the implementation of the N-API interfaces, and the page layout and business logic of the application layer.

N-API implementation

1. Define the interfaces in index.d.ts

    /**
     * Initialize the text recognition engine.
     * @param lang language(s) to recognize, e.g. eng, chi_sim, eng+chi_sim;
     *             null or omitted means Chinese and English (eng+chi_sim)
     * @param traindir trained model directory; null or omitted means the default directory
     *
     * @return whether initialization succeeded: 0 => success, -1 => failure
     */
    export function initocr(lang: string, traindir: string): Promise<number>;
    export function initocr(lang: string, traindir: string, callback: AsyncCallback<number>): void;

    /**
     * Start recognition.
     * @param imagepath image path (currently supported formats: png, jpg, tiff)
     *
     * @return the recognition result
     */
    export function startocr(imagepath: string): Promise<string>;
    export function startocr(imagepath: string, callback: AsyncCallback<string>): void;

    /**
     * Release resources.
     */
    export function destroyocr(): void;

As the declarations show, the N-API interfaces initocr and startocr each come in two forms: one returns a promise and the other takes a callback. The application layer of this sample uses the callback form.

2. Register the N-API module and interfaces

    EXTERN_C_START
    static napi_value init(napi_env env, napi_value exports) {
        // Map each JS-visible name to its native implementation
        napi_property_descriptor desc[] = {
            {"initocr", nullptr, initocr, nullptr, nullptr, nullptr, napi_default, nullptr},
            {"startocr", nullptr, startocr, nullptr, nullptr, nullptr, napi_default, nullptr},
            {"destroyocr", nullptr, destroyocr, nullptr, nullptr, nullptr, napi_default, nullptr},
        };
        napi_define_properties(env, exports, sizeof(desc) / sizeof(desc[0]), desc);
        return exports;
    }
    EXTERN_C_END

    static napi_module demomodule = {
        .nm_version = 1,
        .nm_flags = 0,
        .nm_filename = nullptr,
        .nm_register_func = init,
        .nm_modname = "tesseract",
        .nm_priv = ((void *)0),
        .reserved = {0},
    };

    extern "C" __attribute__((constructor)) void registerhellomodule(void) {
        napi_module_register(&demomodule);
    }

nm_modname defines the module name and nm_register_func registers the interface functions. The init function binds the JS names initocr, startocr, and destroyocr to their native implementations, and those native functions can then call into the Tesseract library.

3. The N-API implementation, using the callback form of startocr as an example

    static napi_value startocr(napi_env env, napi_callback_info info) {
        OH_LOG_ERROR(LogType::LOG_APP, "ocr startocr 111");
        size_t argc = 2;
        napi_value args[2] = { nullptr };
        // 1. Get the arguments passed in from JS
        napi_get_cb_info(env, info, &argc, args, nullptr, nullptr);
        // 2. Data shared between the execute and complete functions
        auto addondata = new startocraddondata{
            .asyncwork = nullptr,
        };
        // 3. Convert N-API types to C/C++ types
        char imagepath[1024] = { 0 };
        size_t length = 0;
        napi_get_value_string_utf8(env, args[0], imagepath, 1024, &length);
        addondata->args0 = string(imagepath);
        napi_create_reference(env, args[1], 1, &addondata->callback);
        // 4. Create the async work
        napi_value resourcename = nullptr;
        napi_create_string_utf8(env, "startocr", NAPI_AUTO_LENGTH, &resourcename);
        napi_create_async_work(env, nullptr, resourcename, executestartocr,
                               completestartocrforcallback, (void *)addondata,
                               &addondata->asyncwork);
        // Queue the async work; the runtime schedules its execution
        napi_queue_async_work(env, addondata->asyncwork);
        napi_value result = nullptr;
        napi_get_null(env, &result);
        return result;
    }

The function first calls napi_get_cb_info to obtain the arguments passed from JS and converts them to the corresponding C++ types. It then creates an async work item whose parameters include the function to execute and the function to call once execution completes. The execute function looks like this:

    static void executestartocr(napi_env env, void *data) {
        // Retrieve the shared data through the data pointer
        startocraddondata *addondata = (startocraddondata *)data;
        napi_value resultvalue;
        try {
            if (api != nullptr) {
                // Read the image pixels
                Pix *pix = pixRead((const char *)addondata->args0.c_str());
                // Hand the image to the Tesseract API
                api->SetImage(pix);
                // Run text extraction and get the text in the image
                char *result = api->GetUTF8Text();
                addondata->result = result;
                // Release resources
                pixDestroy(&pix);
                delete[] result;
            }
        } catch (const std::exception &e) {
            std::string error = "error: ";
            if (initresult != 0) {
                error += "please first init tesseractocr.";
            } else {
                error += e.what();
            }
            addondata->result = error;
        }
    }

This function reads the JS arguments from data and then calls the Tesseract interfaces that perform the text extraction. When it finishes, completestartocrforcallback is invoked. That function converts the result produced by the execute function to the corresponding JS type and returns it through the callback.

    static void completestartocrforcallback(napi_env env, napi_status status, void *data) {
        startocraddondata *addondata = (startocraddondata *)data;
        napi_value callback = nullptr;
        napi_get_reference_value(env, addondata->callback, &callback);
        napi_value undefined = nullptr;
        napi_get_undefined(env, &undefined);
        napi_value result = nullptr;
        napi_create_string_utf8(env, addondata->result.c_str(), addondata->result.length(), &result);
        // Invoke the JS callback with the result
        napi_value returnval = nullptr;
        napi_call_function(env, undefined, callback, 1, &result, &returnval);
        // Delete the napi_ref object
        if (addondata->callback != nullptr) {
            napi_delete_reference(env, addondata->callback);
        }
        // Delete the async work item
        napi_delete_async_work(env, addondata->asyncwork);
        delete addondata;
    }

Application layer implementation

The application layer consists of three modules: text recognition on animal pictures, identity information recognition, and extraction of text to a local file.

1. Text recognition on animal pictures

    build() {
      Column() {
        Row() {
          Text('Tap an image to extract text. Result: ').fontSize('30fp').fontColor(Color.Blue)
          Text(this.ocrresult).fontSize('50fp').fontColor(Color.Red)
        }.margin('10vp').height('10%').alignItems(VerticalAlign.Center)
        Grid() {
          ForEach(this.images, (item, index) => {
            GridItem() {
              animalitem({ path1: item[0], path2: item[1] });
            }
          })
        }
        .padding({left: this.columnspace, right: this.columnspace})
        .columnsTemplate('1fr 1fr 1fr') // split the grid width into 3 equal parts
        .rowsTemplate('1fr 1fr')        // split the grid height into 2 equal parts
        .rowsGap(this.rowspace)         // row spacing
        .columnsGap(this.columnspace)   // column spacing
        .width('100%')
        .height('90%')
      }
      .backgroundColor(Color.Pink)
    }

The layout is built on a Grid. Each item is a picture; tapping a picture extracts its text and shows the result in the title bar.

2. Identity information recognition

    build() {
      Row() {
        Column() {
          Image('/common/idimages/aobamao.jpg')
            .onClick(() => {
              // Tap the image to recognize the information on it
              console.log('ocr begin dialog open 111');
              this.ocrdialog.open();
              toolutils.ocrresult(toolutils.aobamao, (result) => {
                console.log('111 ocr result = ' + result);
                this.result = result;
                this.ocrdialog.close();
              });
            })
            .margin('10vp')
            .objectFit(ImageFit.Auto)
            .height('50%')
          Image('/common/idimages/weixiaobao.jpg')
            .onClick(() => {
              // Tap the image to recognize the information on it
              this.ocrdialog.open();
              toolutils.ocrresult(toolutils.weixiaobao, (result) => {
                console.log('111 ocr result = ' + result);
                this.result = result;
                this.ocrdialog.close();
              });
            })
            .margin('10vp')
            .objectFit(ImageFit.Auto)
            .height('50%')
        }
        .width(this.screenwidth/2)
        .padding('20vp')
        Column() {
          Text(this.title).height('10%').fontSize('30fp').fontColor(this.titlecolor)
          Column() {
            Text(this.result)
              .fontColor('#0000ff')
              .fontSize('50fp')
          }.justifyContent(FlexAlign.Center).alignItems(HorizontalAlign.Center).height('90%')
        }
        .justifyContent(FlexAlign.Start)
        .width('50%')
      }
      .width('100%')
      .height('100%')
    }

The outermost layout for identity information recognition is a horizontal Row split into left and right halves. The left child is a vertical Column holding two different ID card images. The right child is also a vertical Column, containing the title area and the area that displays the recognition result.

3. Extracting text to a local file

    Row() {
      Column() {
        Image('/common/save2fileimages/testimage1.png')
          .onClick(() => {
            // Tap the image to recognize the text on it
            toolutils.ocrresult(toolutils.testimage1, (result) => {
              let path = this.dir + 'ocrresult1.txt';
              try {
                // 0o100 | 0o2: create the file if missing, open for read and write
                let fd = fileio.openSync(path, 0o100 | 0o2, 0o666);
                fileio.writeSync(fd, result);
                fileio.closeSync(fd);
                this.displaytext = 'File written to ' + path;
              } catch (e) {
                console.log('ocr fileio error = ' + e);
              }
            });
          })
        Image('/common/save2fileimages/testimage2.png')
          .onClick(() => {
            // Tap the image to recognize the text on it
            toolutils.ocrresult(toolutils.testimage2, (result) => {
              let path = this.dir + 'ocrresult2.txt';
              let fd = fileio.openSync(path, 0o100 | 0o2, 0o666);
              fileio.writeSync(fd, result);
              fileio.closeSync(fd);
              this.displaytext = 'File written to ' + path;
            });
          })
      }
      Column() {
        Text(this.title)
        Column() {
          Text(this.displaytext)
        }
      }
    }

This feature first recognizes the text in the image through the OCR interface and then writes the text to a file using fileio.

6. Summary
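The sample integrates a C++ third-party library into the application natively and exposes it to the upper-layer application through N-API. Any application that depends on a third-party library can take the same approach: port the library to the native layer, then provide N-API interfaces for the application to call. On sample development, I have previously shared "How to implement a doodle feature with the canvas component of OpenHarmony ArkUI?" and "How to implement variable-speed recording with the OpenHarmony audio module?". Interested developers are welcome to read them and to exchange sample development experience with me.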
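As a rough illustration of how an application might drive the callback-style interfaces, the sketch below chains initialization, recognition, and cleanup for a single image. It is not the sample's actual code. The names ResultCallback, OcrNative, FakeOcr, and recognizeOnce are hypothetical, and FakeOcr is only a stand-in that mimics the contract described in this article (0 or -1 from initocr, a text or error string from startocr) so that the sketch runs without a device or the real native module.

```typescript
// Hypothetical single-argument callback type (an assumption): the native
// completion function in this article calls the JS callback with one result.
type ResultCallback<T> = (result: T) => void;

// Assumed shape of the native module, modeled on the index.d.ts shown earlier.
interface OcrNative {
  initocr(lang: string, traindir: string, callback: ResultCallback<number>): void;
  startocr(imagepath: string, callback: ResultCallback<string>): void;
  destroyocr(): void;
}

// Stand-in for the real N-API module. It performs no OCR and only mimics
// the init/start/destroy contract so the flow can be exercised anywhere.
class FakeOcr implements OcrNative {
  private ready = false;

  initocr(lang: string, _traindir: string, callback: ResultCallback<number>): void {
    this.ready = lang.length > 0;
    callback(this.ready ? 0 : -1);
  }

  startocr(imagepath: string, callback: ResultCallback<string>): void {
    callback(this.ready
      ? 'text from ' + imagepath
      : 'error: please first init tesseractocr.');
  }

  destroyocr(): void {
    this.ready = false;
  }
}

// Initialize the engine, recognize one image, release the engine,
// then hand the recognized text (or an error string) to the caller.
function recognizeOnce(native: OcrNative, imagepath: string, done: ResultCallback<string>): void {
  native.initocr('eng+chi_sim', '', (code: number) => {
    if (code !== 0) {
      done('error: init failed');
      return;
    }
    native.startocr(imagepath, (text: string) => {
      native.destroyocr();
      done(text);
    });
  });
}

// Usage with the stand-in module; the image path is a made-up example.
recognizeOnce(new FakeOcr(), '/data/test.png', (text: string) => {
  console.log(text);
});
```

Initializing and destroying the engine around every image keeps the sketch short. An application that recognizes many images would more plausibly initialize once at startup and destroy once on exit.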
Original title: OpenHarmony集成OCR三方库实现文字提取
Source: WeChat official account "OpenAtom OpenHarmony". Please credit the source when reposting.
