python – Splitting text lines in a scanned document

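I am trying to find a way to break up the lines of text in a scanned document that has been adaptively thresholded. Right now I store the document's pixel values as unsigned integers from 0 to 255, take the average of the pixels in each row, and split the rows into ranges based on whether that average is greater than 250. I then take the median of each range of rows for which this holds. However, this approach sometimes fails, because there can be black splotches on the image.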

Is there a more noise-resistant way to do this task?

Edit: here is some code. "warped" is the name of the original image, and "cuts" is where I want to split the image.

import numpy as np
from skimage.filters import threshold_adaptive  # available in older scikit-image releases

warped = threshold_adaptive(warped, 250, offset=10)
warped = warped.astype("uint8") * 255

# get areas where we can split image on whitespace to make OCR more accurate
color_level = warped.mean(axis=1)  # mean pixel value of each row
cuts = []
i = 0
while i < len(color_level):
    if color_level[i] > 250:
        begin = i
        # walk to the end of this whitespace run without running past the last row
        while i < len(color_level) and color_level[i] > 250:
            i += 1
        cuts.append((i + begin) // 2)  # middle of the whitespace region
    else:
        i += 1
Edit 2: added a sample image

Solution

Starting from your input image, you need to make the text white and the background black.
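
As a minimal sketch of this step in Python, assuming a grayscale `uint8` image with dark text on light paper: the helper name `text_as_white` is hypothetical, and the cutoff of 200 is simply the value used in the C++ code further down, not a tuned constant.

```python
import numpy as np

def text_as_white(gray, cutoff=200):
    """Return a uint8 image where text pixels are 255 and background is 0.

    gray   : 2-D uint8 array, dark text on a light background (assumed).
    cutoff : pixels darker than this are treated as text (assumed value).
    """
    return np.where(gray < cutoff, 255, 0).astype(np.uint8)
```

If the image has already been thresholded to 0/255 with black text, as `warped` is in the question, inverting it with `255 - warped` should give the same polarity.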

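You then need to compute the rotation angle of the bill. A simple way is to find the minAreaRect of all the white points (findNonZero), which gives you the rotated rectangle around the text.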

Then you can rotate your bill so that the text is horizontal.

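Now you can compute the horizontal projection (reduce), taking the average value of each row. Apply a threshold th to the projection to allow for some noise in the image (here I used 0, i.e. no noise). In the resulting binary histogram, rows that contain only background have a value > 0 and text rows have the value 0. Then take the average bin coordinate of each continuous run of white bins in the histogram. That is the y coordinate of each of your lines.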
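
The projection step can be sketched in Python with NumPy alone. This is an illustrative port under stated assumptions, not the answer's own code: the function name `whitespace_row_centers` is hypothetical, the input is assumed to be already deskewed with text non-zero and background zero, and unlike the C++ loop below it also reports a whitespace run that touches the bottom edge of the image.

```python
import numpy as np

def whitespace_row_centers(binary, th=0.0):
    """Return the centre row of every run of background-only rows.

    binary : 2-D array, text pixels non-zero, background zero (assumed).
    th     : a row counts as background when its mean value is <= th.
    """
    projection = binary.astype(np.float64).mean(axis=1)  # one mean per row
    is_space = projection <= th

    centers = []
    start = None  # first row of the whitespace run currently being scanned
    for row, space in enumerate(is_space):
        if space and start is None:
            start = row
        elif not space and start is not None:
            centers.append((start + row - 1) // 2)  # mean of first and last row
            start = None
    if start is not None:  # run that reaches the bottom edge
        centers.append((start + len(is_space) - 1) // 2)
    return centers
```

Raising `th` above 0 is what gives the noise tolerance asked about in the question: a row containing only a few stray white pixels has a small mean and is still counted as whitespace, while a row of text has a much larger mean.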

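Here is the code. It is in C++, but since most of the work is done by OpenCV functions, it should be easy to convert to Python. At the very least you can use it as a reference: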

#include <opencv2/opencv.hpp>
using namespace cv;
using namespace std;

int main()
{
    // Read image
    Mat3b img = imread("path_to_image");

    // Binarize image. Text is white,background is black
    Mat1b bin;
    cvtColor(img,bin,COLOR_BGR2GRAY);
    bin = bin < 200;

    // Find all white pixels
    vector<Point> pts;
    findNonZero(bin,pts);

    // Get rotated rect of white pixels
    RotatedRect Box = minAreaRect(pts);
    if (Box.size.width > Box.size.height)
    {
        swap(Box.size.width,Box.size.height);
        Box.angle += 90.f;
    }

    Point2f vertices[4];
    Box.points(vertices);

    for (int i = 0; i < 4; ++i)
    {
        line(img,vertices[i],vertices[(i + 1) % 4],Scalar(0,255,0));
    }

    // Rotate the image according to the found angle
    Mat1b rotated;
    Mat M = getRotationMatrix2D(Box.center,Box.angle,1.0);
    warpAffine(bin,rotated,M,bin.size());

    // Compute horizontal projections
    Mat1f horProj;
    reduce(rotated,horProj,1,REDUCE_AVG); // named CV_REDUCE_AVG in OpenCV 2.x

    // Remove noise in histogram. White bins identify space lines,black bins identify text lines
    float th = 0;
    Mat1b hist = horProj <= th;

    // Get mean coordinate of white pixel groups
    vector<int> ycoords;
    int y = 0;
    int count = 0;
    bool isSpace = false;
    for (int i = 0; i < rotated.rows; ++i)
    {
        if (!isSpace)
        {
            if (hist(i))
            {
                isSpace = true;
                count = 1;
                y = i;
            }
        }
        else
        {
            if (!hist(i))
            {
                isSpace = false;
                ycoords.push_back(y / count);
            }
            else
            {
                y += i;
                count++;
            }
        }
    }

    // Draw line as final result
    Mat3b result;
    cvtColor(rotated,result,COLOR_GRAY2BGR);
    for (int i = 0; i < ycoords.size(); ++i)
    {
        line(result,Point(0,ycoords[i]),Point(result.cols,ycoords[i]),Scalar(0,255,0));
    }

    return 0;
}
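
Once the y coordinates are known, cutting the image into one strip per text line is a matter of slicing. The following Python sketch assumes a NumPy image with text non-zero and a list of cut rows such as the `ycoords` computed above. The name `split_at_rows` is hypothetical, and dropping strips that contain no text pixels is a choice made here, not part of the original answer.

```python
import numpy as np

def split_at_rows(image, cut_rows):
    """Slice `image` horizontally at each row in `cut_rows`.

    image    : 2-D array, text pixels non-zero (assumed).
    cut_rows : row indices lying inside whitespace between text lines.
    Returns the strips that contain at least one text pixel.
    """
    bounds = [0] + sorted(set(cut_rows)) + [image.shape[0]]
    strips = []
    for top, bottom in zip(bounds[:-1], bounds[1:]):
        strip = image[top:bottom]
        if bottom > top and strip.any():  # skip empty or text-free strips
            strips.append(strip)
    return strips
```

Each returned strip is a view into the original array, so it can be passed to an OCR step without copying.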
