# enginex-bi_series-paddleocr

## Build docker image
```bash
docker build -t paddleocr:bi .
```
The base image `corex:3.2.1-ubuntu20.04-py3.10-slim` can be obtained by contacting technical support at the vendor of the Iluvatar CoreX (天数智芯) TianGai 100 (天垓100).

## Testing
### Download models
PP-OCRv4 and earlier versions are supported.

PP-OCRv4 models:
- det: https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar
- rec: https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar
- cls: https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar
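
Fetching and unpacking the three archives can be scripted. The following is a minimal sketch using only the Python standard library. The `MODEL_URLS` dictionary, the helper names, and the `models` destination folder are illustrative assumptions, not part of this repository.

```python
import tarfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# URLs copied from the list above.
MODEL_URLS = {
    "det": "https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_det_infer.tar",
    "rec": "https://paddleocr.bj.bcebos.com/PP-OCRv4/chinese/ch_PP-OCRv4_rec_infer.tar",
    "cls": "https://paddleocr.bj.bcebos.com/dygraph_v2.0/ch/ch_ppocr_mobile_v2.0_cls_infer.tar",
}


def archive_stem(url: str) -> str:
    """Return the archive file name without its .tar suffix."""
    return Path(urlparse(url).path).stem


def download_and_extract(url: str, dest: Path) -> Path:
    """Download one .tar archive into dest and unpack it there."""
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / Path(urlparse(url).path).name
    urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall(dest)
    # Assumption: each archive unpacks into a folder named after the archive.
    return dest / archive_stem(url)


def download_all(dest: Path = Path("models")) -> dict:
    """Download every model in MODEL_URLS (hypothetical helper)."""
    return {name: download_and_extract(url, dest) for name, url in MODEL_URLS.items()}
```

Calling `download_all(Path("/mnt/contest_ceph/zhanghao/models/ocr"))` would place the models under the directory that the test below expects, provided that path is writable.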

### Run the test
Place the models under `/mnt/contest_ceph/zhanghao/models/ocr/`, then run the test program below. It recognizes the text in the sample image.
```bash
./run_in_docker.sh python3 test.py
```
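
The contents of `test.py` are not shown here. As a rough illustration of what such a test might do, the sketch below loads the three models with the `PaddleOCR` class from the paddleocr 2.x Python package and prints the recognized text. The sub-folder names, the helper names, and the sample image path are assumptions, and the actual script may differ.

```python
from pathlib import Path

# Model root taken from the text above. The sub-folder names are an assumption
# (the names the downloaded archives are expected to unpack to).
MODEL_ROOT = Path("/mnt/contest_ceph/zhanghao/models/ocr")


def model_dirs(root: Path = MODEL_ROOT) -> dict:
    """Map PaddleOCR constructor arguments to model folders under root."""
    return {
        "det_model_dir": str(root / "ch_PP-OCRv4_det_infer"),
        "rec_model_dir": str(root / "ch_PP-OCRv4_rec_infer"),
        "cls_model_dir": str(root / "ch_ppocr_mobile_v2.0_cls_infer"),
    }


def recognize(image_path: str, root: Path = MODEL_ROOT) -> list:
    """Return (text, score) pairs for one image, assuming the paddleocr 2.x API."""
    # Imported lazily: the package is only expected to exist inside the image.
    from paddleocr import PaddleOCR

    ocr = PaddleOCR(use_angle_cls=True, lang="ch", **model_dirs(root))
    pages = ocr.ocr(image_path, cls=True)
    # In paddleocr 2.x each detected line is [box, (text, score)];
    # a page without text may be None.
    lines = (pages[0] or []) if pages else []
    return [(text, score) for _box, (text, score) in lines]


if __name__ == "__main__":
    # "sample.jpg" is a placeholder image path.
    for text, score in recognize("sample.jpg"):
        print(f"{score:.2f}\t{text}")
```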

## OCR API Server
An HTTP API server is provided. It loads the models as a service and performs image recognition. Start it with:
```bash
./run_in_docker.sh python3 app.py
```
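
A client may need to wait until the server accepts connections before sending requests. The sketch below polls a TCP port until it is reachable. The helper name `wait_for_port` is hypothetical, and port 8080 is assumed from the request example in this README.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_port("127.0.0.1", 8080)
    print("server ready" if ready else "server did not start in time")
```

A reachable port only shows that the process is listening. Whether the models have finished loading at that point depends on how `app.py` starts up.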
The HTTP interface is as follows:
```
POST /predict
Content-Type: multipart/form-data
# Request (shown as `requests` keyword arguments):
{
    files={'image': f},
    data= {'image_name': "test.jpg"} # image file name
}
# Response (JSON):
{
    "success": true,
    "result": [] # json_list
}
```
The format of `json_list` is:
```json
[
    {
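        "bbox": [0, 0, 100, 200], // [top-left x, top-left y, bottom-right x, bottom-right y]
        "type": "",    // type of the image region; currently only Title is supported
        "content": "今天", // varies by type, but is always the content of the block; currently text
        "page": 1,     // currently always 1
        "score": 0.9  // layout-analysis confidence score for this bbox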
    }
]
```
     
   - Python request example
   ```python
    import requests

    with open(local_image_path, "rb") as f:
        res = requests.post("http://127.0.0.1:8080/predict", files={'image': f}, data={'image_name': "a.jpg"}).json()
   ```
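
The response can be turned into typed records before further processing. The sketch below parses the documented `success` and `result` fields. The `OcrBlock` class and the `parse_response` and `join_text` helpers are hypothetical names introduced for illustration, and treating `type` and `page` as optional is an assumption.

```python
from dataclasses import dataclass


@dataclass
class OcrBlock:
    bbox: tuple    # (top-left x, top-left y, bottom-right x, bottom-right y)
    type: str      # region type
    content: str   # recognized text
    page: int
    score: float   # confidence for this bbox


def parse_response(payload: dict) -> list:
    """Convert the decoded JSON response into a list of OcrBlock records."""
    if not payload.get("success"):
        raise ValueError(f"OCR request failed: {payload!r}")
    blocks = []
    for item in payload.get("result", []):
        x1, y1, x2, y2 = item["bbox"]
        blocks.append(
            OcrBlock(
                bbox=(x1, y1, x2, y2),
                type=item.get("type", ""),
                content=item["content"],
                page=int(item.get("page", 1)),
                score=float(item["score"]),
            )
        )
    return blocks


def join_text(blocks: list, min_score: float = 0.0) -> str:
    """Join block contents line by line, skipping blocks below min_score."""
    return "\n".join(b.content for b in blocks if b.score >= min_score)
```

With the `res` value from the request example, `join_text(parse_response(res), min_score=0.5)` would return the recognized text with low-confidence blocks dropped. The 0.5 threshold is an arbitrary illustration.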