C# OCR Image to Text Tutorial: Convert Images to Text Without Tesseract

Q: How can I improve OCR accuracy on low-quality images?

IronOCR provides image filters such as input.Deskew() and input.DeNoise(), which correct skew and reduce noise and can substantially improve OCR accuracy.

Q: What are the steps to extract text from a multi-page document with OCR in C#?

For multi-page documents, load a PDF with LoadPdf() or load the frames of a multi-page TIFF; IronOCR then converts each page to text.

Q: Can text and barcodes be read from an image at the same time?

Yes. IronOCR can read both text and barcodes from a single image. Setting ocr.Configuration.ReadBarCodes = true enables barcode reading, so both the text and the barcode data are extracted.

Q: How do I configure OCR for documents in multiple languages?

IronOCR supports more than 125 languages. Set the primary language with ocr.Language and add further languages with ocr.AddSecondaryLanguage() to process multilingual documents.

Q: What options are there for exporting OCR results?

IronOCR offers several export methods, including SaveAsSearchablePdf() for PDF, SaveAsTextFile() for plain text, and SaveAsHocrFile() for HOCR HTML.
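
As a minimal sketch of the three export methods named above, the following writes one OCR result to all three formats. The input and output file names are hypothetical placeholders, and the snippet assumes the IronOcr package is installed.

```csharp
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Hypothetical input file; replace with your own image
    input.LoadImage("document.png");

    OcrResult result = ocr.Read(input);

    // Hypothetical output names, one per export format
    result.SaveAsSearchablePdf("document.pdf"); // searchable PDF
    result.SaveAsTextFile("document.txt");      // plain text
    result.SaveAsHocrFile("document.html");     // HOCR HTML with layout data
}
```

All three calls reuse the same OcrResult, so the image is read only once.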

Q: How can I speed up OCR on large image files?

Use OcrLanguage.EnglishFast for faster recognition, and define the region to scan with a System.Drawing.Rectangle so that OCR runs only on the area you need.

Q: How do I run OCR on a protected PDF?

Pass the correct password to LoadPdf(). For image-based PDFs, IronOCR automatically converts the pages to images before running OCR.

Q: What should I do if OCR results are inaccurate?

Apply IronOCR's image enhancement filters, such as input.Deskew() and input.DeNoise(), and check that the correct language pack is installed.

Q: Can the OCR process be customized to exclude specific characters?

Yes. The BlackListCharacters property excludes specific characters from recognition, which keeps the output focused on relevant text and can improve both accuracy and speed.

Jacob Mellor

Updated: January 20, 2026

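Want to convert images to text in C# without a complicated Tesseract setup? This IronOCR C# tutorial shows how to add optical character recognition (OCR) to a .NET application with just a few lines of code.

Quick Start: Extract Text from an Image in One Line

This example shows how easy IronOCR is to pick up: a single line of C# converts an image to text. It initializes the OCR engine, reads the image, and returns the text, with no complex setup.

Install IronOCR with the NuGet Package Manager (https://www.nuget.org/packages/IronOcr):
PM > Install-Package IronOcr

Copy and run the following code snippet: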

string text = new IronTesseract().Read("image.png").Text;

Deploy it to test in your live environment.

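Minimal Workflow (5 Steps)

Download IronOCR, a C# OCR library for converting images to text.
Use the IronTesseract class to read text from an image.
Apply image filters to improve OCR accuracy on low-quality scans.
Handle multiple languages with downloadable language packs.
Export the results as a searchable PDF, or extract the text as a string.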

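How Do I Read Text from Images in a .NET Application?

To convert images to text with OCR in a C# .NET application, you need a reliable OCR library. IronOCR provides a managed solution built around the IronOcr.IronTesseract class, designed for both accuracy and speed with no external dependencies.

First, install IronOCR in your Visual Studio project. You can download the IronOCR DLL directly or use the NuGet Package Manager: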

Install-Package IronOcr

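Why Choose IronOCR for C# OCR Without Tesseract?

When you need to convert images to text in C#, IronOCR offers clear advantages over a traditional Tesseract implementation:

Works out of the box in pure .NET.
Requires no Tesseract installation or configuration.
Runs the latest engine, Tesseract 5 (Tesseract 4 and 3 are also included).
Compatible with .NET Framework 4.6.2+, .NET Standard 2+, .NET Core 2 and 3, and .NET 5, 6, 7, 8, 9, and 10.
Improves accuracy and speed compared with stock Tesseract.
Supports Xamarin, Mono, Azure, and Docker deployments.
Manages the complex Tesseract dictionaries through NuGet packages.
Handles PDFs, multi-frame TIFFs, and all major image formats automatically.
Corrects low-quality and skewed scans for better results.

Learn Basic OCR Usage with This IronOCR C# Tutorial

This Iron Tesseract C# example shows the simplest way to read text from an image with IronOCR. The IronOcr.IronTesseract class reads the image, and the extracted text is available as a string on the result.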

// Basic C# OCR image to text conversion using IronOCR
// This example shows how to extract text from images without complex setup

using IronOcr;
using System;

try
{
    // Initialize IronTesseract for OCR operations
    var ocrEngine = new IronTesseract();

    // Path to your image file - supports PNG, JPG, TIFF, BMP, and more
    var imagePath = @"img\Screenshot.png";

    // Create input and perform OCR to convert image to text
    using (var input = new OcrInput(imagePath))
    {
        // Read text from image and get results
        OcrResult result = ocrEngine.Read(input);

        // Display extracted text
        Console.WriteLine(result.Text);
    }
}
catch (OcrException ex)
{
    // Handle OCR-specific errors
    Console.WriteLine($"OCR Error: {ex.Message}");
}
catch (Exception ex)
{
    // Handle general errors
    Console.WriteLine($"Error: {ex.Message}");
}

' Basic C# OCR image to text conversion using IronOCR
' This example shows how to extract text from images without complex setup

Imports IronOcr
Imports System

Try
	' Initialize IronTesseract for OCR operations
	Dim ocrEngine = New IronTesseract()

	' Path to your image file - supports PNG, JPG, TIFF, BMP, and more
	Dim imagePath = "img\Screenshot.png"

	' Create input and perform OCR to convert image to text
	Using input = New OcrInput(imagePath)
		' Read text from image and get results
		Dim result As OcrResult = ocrEngine.Read(input)

		' Display extracted text
		Console.WriteLine(result.Text)
	End Using
Catch ex As OcrException
	' Handle OCR-specific errors
	Console.WriteLine($"OCR Error: {ex.Message}")
Catch ex As Exception
	' Handle general errors
	Console.WriteLine($"Error: {ex.Message}")
End Try

On this clear test image the code achieves 100% accuracy, extracting the text exactly as it appears:

IronOCR Simple Example

In this simple example we test the accuracy of our C# OCR library to read text from a PNG Image. This is a very basic test, but things will get more complicated as the tutorial continues.

The quick brown fox jumps over the lazy dog

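The IronTesseract class handles the complex OCR work internally. It automatically checks alignment, optimizes resolution, and uses AI to read text from the image with human-level accuracy.

Although sophisticated processing runs in the background, including image analysis, engine optimization, and intelligent text recognition, the OCR process keeps pace with human reading speed while maintaining a high level of accuracy.

IronOCR simple example: a screenshot showing IronOCR extracting the text from a PNG image in C# with 100% accuracy.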

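How Do I Implement Advanced C# OCR Without Tesseract Setup?

For production applications that need the best performance when converting images to text in C#, use the OcrInput and IronTesseract classes together. This approach gives fine-grained control over the OCR process.

OcrInput Class Features

Handles many image formats, including JPEG, TIFF, GIF, BMP, and PNG.
Imports whole PDF files or specific pages.
Automatically enhances contrast, resolution, and image quality.
Corrects rotation, scan noise, skew, and negative images.

IronTesseract Class Features

Support for 127+ prepackaged languages
Tesseract 5, 4, and 3 engines included
Document type specification (screenshot, code snippet, or full document)
Integrated barcode reading
Multiple output formats: searchable PDF, HOCR HTML, DOM objects, and strings

How Do I Get Started with OcrInput and IronTesseract?

The following is the recommended configuration for this IronOCR C# tutorial; it works well for most document types.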

using IronOcr;

// Initialize IronTesseract for advanced OCR operations
IronTesseract ocr = new IronTesseract();

// Create input container for processing multiple images
using (OcrInput input = new OcrInput())
{
    // Process specific pages from multi-page TIFF files
    int[] pageIndices = new int[] { 1, 2 };

    // Load TIFF frames - perfect for scanned documents
    input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

    // Execute OCR to read text from image using IronOCR
    OcrResult result = ocr.Read(input);

    // Output the extracted text
    Console.WriteLine(result.Text);
}

Imports IronOcr

' Initialize IronTesseract for advanced OCR operations
Dim ocr As New IronTesseract()

' Create input container for processing multiple images
Using input As New OcrInput()
	' Process specific pages from multi-page TIFF files
	Dim pageIndices() As Integer = { 1, 2 }

	' Load TIFF frames - perfect for scanned documents
	input.LoadImageFrames("img\Potter.tiff", pageIndices)

	' Execute OCR to read text from image using IronOCR
	Dim result As OcrResult = ocr.Read(input)

	' Output the extracted text
	Console.WriteLine(result.Text)
End Using

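This configuration consistently achieves near-perfect accuracy on medium-quality scans. The LoadImageFrames method handles multi-page documents efficiently, which makes it well suited to batch-processing scenarios.

A sample TIFF document demonstrating IronOCR's multi-page text extraction.

The ability to read text and barcodes from scanned documents such as TIFF files shows how much IronOCR simplifies complex OCR work. The library handles real-world documents well and processes multi-page TIFFs and PDF text extraction seamlessly.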
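
For the batch-processing scenario mentioned above, a loop over a folder might look like the sketch below. It assumes a hypothetical folder named scans that contains single-page PNG files, and it writes each result next to its source image; only IronOCR calls that appear elsewhere in this tutorial are used.

```csharp
using System;
using System.IO;
using IronOcr;

// Hypothetical folder of scanned PNG files
const string inputFolder = "scans";

// One engine instance is reused for every file
var ocr = new IronTesseract();

foreach (string imagePath in Directory.EnumerateFiles(inputFolder, "*.png"))
{
    try
    {
        using var input = new OcrInput();
        input.LoadImage(imagePath);

        OcrResult result = ocr.Read(input);

        // Write the text beside the image, e.g. scans/page1.png -> scans/page1.txt
        string textPath = Path.ChangeExtension(imagePath, ".txt");
        result.SaveAsTextFile(textPath);

        Console.WriteLine($"{imagePath} -> {textPath}");
    }
    catch (Exception ex)
    {
        // Keep going if one file cannot be processed
        Console.WriteLine($"Skipped {imagePath}: {ex.Message}");
    }
}
```

Catching the exception per file means one unreadable scan does not stop the rest of the batch.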

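How Does IronOCR Handle Low-Quality Scans?

A noisy, low-resolution document that IronOCR can process accurately using image filters.

IronOCR outperforms other C# OCR libraries on imperfect scans that contain distortion and digital noise. It is designed for real-world scenarios rather than for perfect test images.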

// Advanced Iron Tesseract C# example for low-quality images
using IronOcr;
using System;

var ocr = new IronTesseract();

try
{
    using (var input = new OcrInput())
    {
        // Load specific pages from poor-quality TIFF
        var pageIndices = new int[] { 0, 1 };
        input.LoadImageFrames(@"img\Potter.LowQuality.tiff", pageIndices);

        // Apply deskew filter to correct rotation and perspective
        input.Deskew(); // Critical for improving accuracy on skewed scans

        // Perform OCR with enhanced preprocessing
        OcrResult result = ocr.Read(input);

        // Display results
        Console.WriteLine("Recognized Text:");
        Console.WriteLine(result.Text);
    }
}
catch (Exception ex)
{
    Console.WriteLine($"Error during OCR: {ex.Message}");
}

' Advanced Iron Tesseract C# example for low-quality images
Imports IronOcr
Imports System

Dim ocr = New IronTesseract()

Try
	Using input = New OcrInput()
		' Load specific pages from poor-quality TIFF
		Dim pageIndices = New Integer() { 0, 1 }
		input.LoadImageFrames("img\Potter.LowQuality.tiff", pageIndices)

		' Apply deskew filter to correct rotation and perspective
		input.Deskew() ' Critical for improving accuracy on skewed scans

		' Perform OCR with enhanced preprocessing
		Dim result As OcrResult = ocr.Read(input)

		' Display results
		Console.WriteLine("Recognized Text:")
		Console.WriteLine(result.Text)
	End Using
Catch ex As Exception
	Console.WriteLine($"Error during OCR: {ex.Message}")
End Try

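With input.Deskew(), accuracy on the low-quality scan rises to 99.8%, almost matching the result for the high-quality scan. This is why IronOCR is a preferred choice for C# OCR without the usual Tesseract difficulties.

Image filters add a small amount of preprocessing time, but the cleaner input can significantly reduce overall OCR time. The right balance depends on the quality of your documents.

In most cases, input.Deskew() and input.DeNoise() reliably improve OCR results. Learn more about image preprocessing techniques.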
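
One way to judge whether the two filters help on your own documents is to read the same page with and without them and compare the output. The sketch below does that for the first frame of the low-quality TIFF used earlier; the comparison by character count is only a rough, illustrative signal and not an accuracy measurement.

```csharp
using System;
using IronOcr;

// Same sample file as the example above; replace with your own scan
const string scanPath = @"img\Potter.LowQuality.tiff";
int[] pageIndices = { 0 };

var ocr = new IronTesseract();

// Baseline: read the page with no preprocessing
string rawText;
using (var rawInput = new OcrInput())
{
    rawInput.LoadImageFrames(scanPath, pageIndices);
    rawText = ocr.Read(rawInput).Text;
}

// Same page with both filters applied before reading
string filteredText;
using (var filteredInput = new OcrInput())
{
    filteredInput.LoadImageFrames(scanPath, pageIndices);
    filteredInput.Deskew();  // straighten the page
    filteredInput.DeNoise(); // remove digital noise
    filteredText = ocr.Read(filteredInput).Text;
}

Console.WriteLine($"Without filters: {rawText.Length} characters");
Console.WriteLine($"With filters:    {filteredText.Length} characters");
Console.WriteLine(filteredText);
```

Reading both outputs side by side against the source page is the more reliable check; the counts only flag large differences.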

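How Do I Optimize OCR Performance and Speed?

When converting images to text in C#, the factor with the greatest effect on OCR speed is the quality of the input. A sufficiently high DPI (around 200 dpi) with minimal noise gives the fastest and most accurate results.

IronOCR is very good at correcting imperfect documents, but that correction takes additional processing time.

Choose image formats with minimal compression artifacts. TIFF and PNG generally process faster than JPEG because they contain less digital noise.

Which Image Filters Improve OCR Speed?

The following filters can significantly improve the performance of a C# OCR image-to-text workflow:

OcrInput.Rotate(double degrees): Rotates the image clockwise (use a negative value for counterclockwise).
OcrInput.Binarize(): Converts to black and white, which improves results on low-contrast images.
OcrInput.ToGrayScale(): Converts to grayscale for faster processing.
OcrInput.Contrast(): Adjusts contrast automatically to improve accuracy.
OcrInput.DeNoise(): Removes digital artifacts; use it when noise is expected.
OcrInput.Invert(): Inverts colors, for images with white text on a black background.
OcrInput.Dilate(): Expands text boundaries.
OcrInput.Erode(): Shrinks text boundaries.
OcrInput.Deskew(): Corrects alignment; essential for skewed documents.
OcrInput.DeepCleanBackgroundNoise(): Performs aggressive noise removal.
OcrInput.EnhanceResolution(): Improves the quality of low-resolution images.
OcrInput.DetectPageOrientation(): Detects and corrects page rotation. Pass an OrientationDetectionMode to balance accuracy against speed: Fast, Balanced, Detailed, or ExtremeDetailed (added in v2025.8.6).

Scale() and EnhanceResolution() are not compatible with SaveAsSearchablePdf() because of a known issue in v2025.12.3. All other filters work normally with searchable PDF output.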
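
The compatibility note above can be illustrated with a short sketch that applies only filters outside the affected pair before saving a searchable PDF. The file names are hypothetical, and the second argument to SaveAsSearchablePdf follows the usage shown later in this tutorial, where true applies the active filters to the PDF output.

```csharp
using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Hypothetical scanned page
    input.LoadImage("scan.png");

    // Filters that the note lists as safe for searchable PDF output
    input.Deskew();
    input.DeNoise();

    // Scale() and EnhanceResolution() are deliberately left out here
    OcrResult result = ocr.Read(input);

    // true: apply the active OcrInput filters to the saved PDF pages
    result.SaveAsSearchablePdf("scan-searchable.pdf", true);
}
```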

How Do I Configure IronOCR for Maximum Speed?

To optimize processing speed for high-quality scans, use the following settings:

using IronOcr;

// Configure for speed - ideal for clean documents
IronTesseract ocr = new IronTesseract();

// Exclude problematic characters to speed up recognition
ocr.Configuration.BlackListCharacters = "~`$#^*_{[]}|\\";

// Use automatic page segmentation
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto;

// Select fast English language pack
ocr.Language = OcrLanguage.EnglishFast;

using (OcrInput input = new OcrInput())
{
    // Load specific pages from document
    int[] pageIndices = new int[] { 1, 2 };
    input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

    // Read with optimized settings
    OcrResult result = ocr.Read(input);
    Console.WriteLine(result.Text);
}

Imports IronOcr

' Configure for speed - ideal for clean documents
Dim ocr As New IronTesseract()

' Exclude problematic characters to speed up recognition
ocr.Configuration.BlackListCharacters = "~`$#^*_{[]}|\"

' Use automatic page segmentation
ocr.Configuration.PageSegmentationMode = TesseractPageSegmentationMode.Auto

' Select fast English language pack
ocr.Language = OcrLanguage.EnglishFast

Using input As New OcrInput()
	' Load specific pages from document
	Dim pageIndices() As Integer = { 1, 2 }
	input.LoadImageFrames("img\Potter.tiff", pageIndices)

	' Read with optimized settings
	Dim result As OcrResult = ocr.Read(input)
	Console.WriteLine(result.Text)
End Using

This optimized configuration runs 35% faster than the default settings while maintaining 99.8% accuracy.
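
Speed gains depend on the document, so it is worth measuring on your own files. The following is a rough timing sketch, not a rigorous benchmark: the image path and the helper names TimeRead and TimeSavedPercent are hypothetical, and a single run per configuration can be skewed by one-time engine start-up cost.

```csharp
using System;
using System.Diagnostics;
using IronOcr;

// Hypothetical test image; use a representative document of your own
const string imagePath = "img/sample.png";

// Default configuration
var defaultOcr = new IronTesseract();

// Speed-oriented configuration from the example above
var fastOcr = new IronTesseract();
fastOcr.Language = OcrLanguage.EnglishFast;
fastOcr.Configuration.BlackListCharacters = "~`$#^*_{[]}|\\";

long baselineMs = TimeRead(defaultOcr, imagePath);
long tunedMs = TimeRead(fastOcr, imagePath);

Console.WriteLine($"Default: {baselineMs} ms, tuned: {tunedMs} ms");
Console.WriteLine($"Time saved: {TimeSavedPercent(baselineMs, tunedMs):F1}%");

// Loads the image and times only the Read call
static long TimeRead(IronTesseract ocr, string path)
{
    using var input = new OcrInput();
    input.LoadImage(path);

    var stopwatch = Stopwatch.StartNew();
    ocr.Read(input);
    stopwatch.Stop();
    return stopwatch.ElapsedMilliseconds;
}

// Percentage of the baseline time saved by the tuned run
static double TimeSavedPercent(long baselineMs, long tunedMs)
{
    if (baselineMs <= 0) return 0;
    return (baselineMs - tunedMs) * 100.0 / baselineMs;
}
```

For steadier numbers, run each configuration several times and discard the first run.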

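How Do I Read a Specific Region of an Image with C# OCR?

The Iron Tesseract C# example below shows how to target a specific region using System.Drawing.Rectangle. This technique is especially useful for standardized forms where text appears in predictable positions.

Can IronOCR Process Cropped Regions for Faster Results?

Using pixel-based coordinates, you can restrict OCR to a specific area, which significantly improves speed and avoids extracting unwanted text.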

using IronOcr;
using IronSoftware.Drawing;

// Initialize OCR engine for targeted region processing
var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Define exact region for OCR - coordinates in pixels
    var contentArea = new System.Drawing.Rectangle(
        x: 215, 
        y: 1250, 
        width: 1335, 
        height: 280
    );

    // Load image with specific area - perfect for forms and invoices
    input.LoadImage("img/ComSci.png", contentArea);

    // Process only the defined region
    OcrResult result = ocr.Read(input);
    Console.WriteLine(result.Text);
}

Imports IronOcr
Imports IronSoftware.Drawing

' Initialize OCR engine for targeted region processing
Dim ocr As New IronTesseract()

Using input As New OcrInput()
    ' Define exact region for OCR - coordinates in pixels
    Dim contentArea As New System.Drawing.Rectangle(215, 1250, 1335, 280)

    ' Load image with specific area - perfect for forms and invoices
    input.LoadImage("img/ComSci.png", contentArea)

    ' Process only the defined region
    Dim result As OcrResult = ocr.Read(input)
    Console.WriteLine(result.Text)
End Using

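This targeted approach improves speed by 41% while extracting only the relevant text. It is ideal for structured documents such as invoices, checks, and forms. The same cropping technique also works for PDF OCR.

A computer science document demonstrating region-based text extraction in C# with IronOCR's rectangle selection.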
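
When the same form is scanned at different resolutions, fixed pixel coordinates stop lining up. One possible workaround, sketched here as an assumption rather than an IronOCR feature, is to describe the region as fractions of the page and convert it to pixels per scan. The helper RegionFromFractions, the page size, and the field position are all hypothetical.

```csharp
using System;
using IronOcr;

// Hypothetical page size in pixels; take it from your own scans
const int pageWidth = 1700;
const int pageHeight = 2200;

// Hypothetical field: starts 10% from the left and 50% from the top,
// spanning 50% of the page width and 10% of the page height
var field = RegionFromFractions(pageWidth, pageHeight, 0.10, 0.50, 0.50, 0.10);
var contentArea = new System.Drawing.Rectangle(field.X, field.Y, field.Width, field.Height);

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Load only the computed region, as in the example above
    input.LoadImage("img/ComSci.png", contentArea);
    Console.WriteLine(ocr.Read(input).Text);
}

// Converts fractional coordinates (0.0 to 1.0) into a pixel rectangle
static (int X, int Y, int Width, int Height) RegionFromFractions(
    int imageWidth, int imageHeight,
    double left, double top, double width, double height)
{
    return (
        (int)Math.Round(imageWidth * left),
        (int)Math.Round(imageHeight * top),
        (int)Math.Round(imageWidth * width),
        (int)Math.Round(imageHeight * height)
    );
}
```

The fractions stay the same for every scan of the form; only the page size changes.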

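How Many Languages Does IronOCR Support?

IronOCR supports 127 languages through convenient language packs. Download them as DLLs from our website or through the NuGet Package Manager.

Install language packs through the NuGet interface (search for "IronOCR"), or see the full list of language packs.

Supported languages include Arabic, Chinese (Simplified and Traditional), Japanese, Korean, Hindi, Russian, German, French, Spanish, and more than 115 others, each optimized for accurate text recognition.

How Do I Implement Multilingual OCR?

This IronOCR C# tutorial example demonstrates Arabic text recognition.

Install-Package IronOcr.Languages.Arabic

IronOCR accurately extracts Arabic text from a GIF image.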

// Install-Package IronOcr.Languages.Arabic
using IronOcr;

// Configure for Arabic language OCR
var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Arabic;

using (var input = new OcrInput())
{
    // Load Arabic text image
    input.LoadImage("img/arabic.gif");

    // IronOCR handles low-quality Arabic text that standard Tesseract cannot
    var result = ocr.Read(input);

    // Save to file (console may not display Arabic correctly)
    result.SaveAsTextFile("arabic.txt");
}

Imports IronOcr

' Configure for Arabic language OCR
Dim ocr As New IronTesseract()
ocr.Language = OcrLanguage.Arabic

Using input As New OcrInput()
    ' Load Arabic text image
    input.LoadImage("img/arabic.gif")

    ' IronOCR handles low-quality Arabic text that standard Tesseract cannot
    Dim result = ocr.Read(input)

    ' Save to file (console may not display Arabic correctly)
    result.SaveAsTextFile("arabic.txt")
End Using

Can IronOCR Handle Multilingual Documents?

When a document mixes several languages, configure IronOCR for multi-language support:

Install-Package IronOcr.Languages.ChineseSimplified

// Multi-language OCR configuration
using IronOcr;

var ocr = new IronTesseract();

// Set primary language
ocr.Language = OcrLanguage.ChineseSimplified;

// Add secondary languages as needed
ocr.AddSecondaryLanguage(OcrLanguage.English);

// Custom .traineddata files can be added for specialized recognition
// ocr.AddSecondaryLanguage("path/to/custom.traineddata");

using (var input = new OcrInput())
{
    // Process multi-language document
    input.LoadImage("img/MultiLanguage.jpeg");

    var result = ocr.Read(input);
    result.SaveAsTextFile("MultiLanguage.txt");
}

Imports IronOcr

' Multi-language OCR configuration
Dim ocr As New IronTesseract()

' Set primary language
ocr.Language = OcrLanguage.ChineseSimplified

' Add secondary languages as needed
ocr.AddSecondaryLanguage(OcrLanguage.English)

' Custom .traineddata files can be added for specialized recognition
' ocr.AddSecondaryLanguage("path/to/custom.traineddata")

Using input As New OcrInput()
    ' Process multi-language document
    input.LoadImage("img/MultiLanguage.jpeg")

    Dim result = ocr.Read(input)
    result.SaveAsTextFile("MultiLanguage.txt")
End Using

How Do I Process Multi-Page Documents with C# OCR?

IronOCR combines multiple pages or images into a single OcrResult. This enables features such as creating searchable PDFs and extracting text from an entire document set.

Combine different sources, such as images, TIFF frames, and PDF pages, in a single OCR operation:

// Multi-source document processing
using IronOcr;

IronTesseract ocr = new IronTesseract();

using (OcrInput input = new OcrInput())
{
    // Add various image formats
    input.LoadImage("image1.jpeg");
    input.LoadImage("image2.png");

    // Process specific frames from multi-frame images
    int[] frameNumbers = { 1, 2 };
    input.LoadImageFrames("image3.gif", frameNumbers);

    // Process all sources together
    OcrResult result = ocr.Read(input);

    // Verify page count
    Console.WriteLine($"{result.Pages.Count} Pages processed.");
}

Imports IronOcr

' Multi-source document processing
Dim ocr As New IronTesseract()

Using input As New OcrInput()
    ' Add various image formats
    input.LoadImage("image1.jpeg")
    input.LoadImage("image2.png")

    ' Process specific frames from multi-frame images
    Dim frameNumbers As Integer() = {1, 2}
    input.LoadImageFrames("image3.gif", frameNumbers)

    ' Process all sources together
    Dim result As OcrResult = ocr.Read(input)

    ' Verify page count
    Console.WriteLine($"{result.Pages.Count} Pages processed.")
End Using

Process the pages of a TIFF file efficiently:

using IronOcr;

IronTesseract ocr = new IronTesseract();

using (OcrInput input = new OcrInput())
{
    // Define pages to process (0-based indexing)
    int[] pageIndices = new int[] { 0, 1 };

    // Load specific TIFF frames
    input.LoadImageFrames("MultiFrame.Tiff", pageIndices);

    // Extract text from all frames
    OcrResult result = ocr.Read(input);

    Console.WriteLine(result.Text);
    Console.WriteLine($"{result.Pages.Count} Pages processed");
}

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
	' Define pages to process (0-based indexing)
	Dim pageIndices() As Integer = { 0, 1 }

	' Load specific TIFF frames
	input.LoadImageFrames("MultiFrame.Tiff", pageIndices)

	' Extract text from all frames
	Dim result As OcrResult = ocr.Read(input)

	Console.WriteLine(result.Text)
	Console.WriteLine($"{result.Pages.Count} Pages processed")
End Using

Convert TIFF or PDF files to a searchable format:

using System;
using IronOcr;

IronTesseract ocr = new IronTesseract();

using (OcrInput input = new OcrInput())
{
    // Set document metadata
    input.Title = "Quarterly Report";

    // Combine multiple sources
    input.LoadImage("image1.jpeg");
    input.LoadImage("image2.png");

    // Add specific frames from animated images
    int[] gifFrames = new int[] { 1, 2 };
    input.LoadImageFrames("image3.gif", gifFrames);

    // Create searchable PDF
    OcrResult result = ocr.Read(input);

    // Pass true to apply any active OcrInput filters to the searchable PDF output (added v2025.5.11)
    result.SaveAsSearchablePdf("searchable.pdf", true);
}

Imports System
Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    ' Set document metadata
    input.Title = "Quarterly Report"

    ' Combine multiple sources
    input.LoadImage("image1.jpeg")
    input.LoadImage("image2.png")

    ' Add specific frames from animated images
    Dim gifFrames As Integer() = {1, 2}
    input.LoadImageFrames("image3.gif", gifFrames)

    ' Create searchable PDF
    Dim result As OcrResult = ocr.Read(input)

    ' Pass true to apply any active OcrInput filters to the searchable PDF output (added v2025.5.11)
    result.SaveAsSearchablePdf("searchable.pdf", True)
End Using

Convert an existing PDF into a searchable version:

using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Set PDF metadata
    input.Title = "Annual Report 2024";

    // Process existing PDF
    input.LoadPdf("example.pdf", "password");

    // Generate searchable version
    var result = ocr.Read(input);
    result.SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim ocr = New IronTesseract()

Using input = New OcrInput()
	' Set PDF metadata
	input.Title = "Annual Report 2024"

	' Process existing PDF
	input.LoadPdf("example.pdf", "password")

	' Generate searchable version
	Dim result = ocr.Read(input)
	result.SaveAsSearchablePdf("searchable.pdf")
End Using

Apply the same technique to TIFF conversion:

using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Configure document properties
    input.Title = "Scanned Archive Document";

    // Select pages to process
    var pageIndices = new int[] { 1, 2 };
    input.LoadImageFrames("example.tiff", pageIndices);

    // Create searchable PDF from TIFF
    OcrResult result = ocr.Read(input);
    result.SaveAsSearchablePdf("searchable.pdf");
}

Imports IronOcr

Dim ocr = New IronTesseract()

Using input = New OcrInput()
	' Configure document properties
	input.Title = "Scanned Archive Document"

	' Select pages to process
	Dim pageIndices = New Integer() { 1, 2 }
	input.LoadImageFrames("example.tiff", pageIndices)

	' Create searchable PDF from TIFF
	Dim result As OcrResult = ocr.Read(input)
	result.SaveAsSearchablePdf("searchable.pdf")
End Using

How Do I Export OCR Results as HOCR HTML?

IronOCR supports HOCR HTML export, which converts PDFs and TIFFs to structured HTML while preserving layout information.

using IronOcr;

var ocr = new IronTesseract();

using (var input = new OcrInput())
{
    // Set HTML title
    input.Title = "Document Archive";

    // Process multiple document types
    input.LoadImage("image2.jpeg");
    input.LoadPdf("example.pdf", "password");

    // Add TIFF pages
    var pageIndices = new int[] { 1, 2 };
    input.LoadImageFrames("example.tiff", pageIndices);

    // Export as HOCR with position data
    OcrResult result = ocr.Read(input);
    result.SaveAsHocrFile("hocr.html");
}

Imports IronOcr

Dim ocr As New IronTesseract()

Using input As New OcrInput()
    ' Set HTML title
    input.Title = "Document Archive"

    ' Process multiple document types
    input.LoadImage("image2.jpeg")
    input.LoadPdf("example.pdf", "password")

    ' Add TIFF pages
    Dim pageIndices As Integer() = {1, 2}
    input.LoadImageFrames("example.tiff", pageIndices)

    ' Export as HOCR with position data
    Dim result As OcrResult = ocr.Read(input)
    result.SaveAsHocrFile("hocr.html")
End Using

Can IronOCR Read Barcodes Along with Text?

IronOCR combines text recognition with barcode reading, so no separate barcode library is needed.

// Enable combined text and barcode recognition
using IronOcr;

var ocr = new IronTesseract();

// Enable barcode detection
ocr.Configuration.ReadBarCodes = true;

using (var input = new OcrInput())
{
    // Load image containing both text and barcodes
    input.LoadImage("img/Barcode.png");

    // Process both text and barcodes
    var result = ocr.Read(input);

    // Extract barcode data
    foreach (var barcode in result.Barcodes)
    {
        Console.WriteLine($"Barcode Value: {barcode.Value}");
        Console.WriteLine($"Type: {barcode.Type}, Location: {barcode.Location}");
    }
}

Imports IronOcr

Dim ocr As New IronTesseract()

' Enable barcode detection
ocr.Configuration.ReadBarCodes = True

Using input As New OcrInput()
    ' Load image containing both text and barcodes
    input.LoadImage("img/Barcode.png")

    ' Process both text and barcodes
    Dim result = ocr.Read(input)

    ' Extract barcode data
    For Each barcode In result.Barcodes
        Console.WriteLine($"Barcode Value: {barcode.Value}")
        Console.WriteLine($"Type: {barcode.Type}, Location: {barcode.Location}")
    Next
End Using
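
Barcode values read from imperfect scans are worth a sanity check before they reach downstream systems. The sketch below is not an IronOCR feature. It assumes barcode.Value is available as a string and that the barcodes you expect are EAN-13, and it applies the standard EAN-13 check-digit rule. The helper name IsValidEan13 is hypothetical.

```csharp
using System;
using System.Linq;

// Returns true when the input is 13 digits long and its last digit
// matches the EAN-13 check digit computed from the first 12 digits.
static bool IsValidEan13(string value)
{
    if (value is null || value.Length != 13 || !value.All(c => c >= '0' && c <= '9'))
        return false;

    int sum = 0;
    for (int i = 0; i < 12; i++)
    {
        int digit = value[i] - '0';
        // Counting from the left, odd positions weigh 1 and even positions weigh 3.
        sum += (i % 2 == 0) ? digit : digit * 3;
    }

    int expectedCheckDigit = (10 - sum % 10) % 10;
    return expectedCheckDigit == value[12] - '0';
}

// Example values; in practice pass barcode.Value from the loop above.
Console.WriteLine(IsValidEan13("4006381333931")); // True
Console.WriteLine(IsValidEan13("4006381333932")); // False
```

Other symbologies use different rules, so a check like this should only run on values whose reported type matches.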

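How Do You Access Detailed OCR Results and Metadata?

The IronOCR results object exposes comprehensive data that advanced developers can use to build sophisticated applications.

Each OcrResult contains a hierarchy of collections: pages, paragraphs, lines, words, and characters. Every element carries detailed metadata such as its location, font information, and confidence score.

Individual elements (paragraphs, words, barcodes) can also be exported as images or bitmaps for further processing.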

using System;
using IronOcr;
using IronSoftware.Drawing;

// Configure with barcode support
IronTesseract ocr = new IronTesseract
{
    Configuration = { ReadBarCodes = true }
};

using OcrInput input = new OcrInput();

// Process multi-page document
int[] pageIndices = { 1, 2 };
input.LoadImageFrames(@"img\Potter.tiff", pageIndices);

OcrResult result = ocr.Read(input);

// Navigate the complete results hierarchy
foreach (var page in result.Pages)
{
    // Page-level data
    int pageNumber = page.PageNumber;
    string pageText = page.Text;
    int pageWordCount = page.WordCount;

    // Extract page elements
    OcrResult.Barcode[] barcodes = page.Barcodes;
    AnyBitmap pageImage = page.ToBitmap();
    double pageWidth = page.Width;
    double pageHeight = page.Height;

    foreach (var paragraph in page.Paragraphs)
    {
        // Paragraph properties
        int paragraphNumber = paragraph.ParagraphNumber;
        string paragraphText = paragraph.Text;
        double paragraphConfidence = paragraph.Confidence;
        var textDirection = paragraph.TextDirection;

        foreach (var line in paragraph.Lines)
        {
            // Line details including baseline information
            string lineText = line.Text;
            double lineConfidence = line.Confidence;
            double baselineAngle = line.BaselineAngle;
            double baselineOffset = line.BaselineOffset;

            foreach (var word in line.Words)
            {
                // Word-level data
                string wordText = word.Text;
                double wordConfidence = word.Confidence;

                // Font information (when available)
                if (word.Font != null)
                {
                    string fontName = word.Font.FontName;
                    double fontSize = word.Font.FontSize;
                    bool isBold = word.Font.IsBold;
                    bool isItalic = word.Font.IsItalic;
                }

                foreach (var character in word.Characters)
                {
                    // Character-level analysis
                    string charText = character.Text;
                    double charConfidence = character.Confidence;

                    // Alternative character choices for spell-checking
                    OcrResult.Choice[] alternatives = character.Choices;
                }
            }
        }
    }
}

Imports System
Imports IronOcr
Imports IronSoftware.Drawing

' Configure with barcode support
Dim ocr As New IronTesseract()
ocr.Configuration.ReadBarCodes = True

Using input As New OcrInput()
    ' Process multi-page document
    Dim pageIndices As Integer() = {1, 2}
    input.LoadImageFrames("img\Potter.tiff", pageIndices)

    Dim result As OcrResult = ocr.Read(input)

    ' Navigate the complete results hierarchy
    For Each page In result.Pages
        ' Page-level data
        Dim pageNumber As Integer = page.PageNumber
        Dim pageText As String = page.Text
        Dim pageWordCount As Integer = page.WordCount

        ' Extract page elements
        Dim barcodes() As OcrResult.Barcode = page.Barcodes
        Dim pageImage As AnyBitmap = page.ToBitmap()
        Dim pageWidth As Double = page.Width
        Dim pageHeight As Double = page.Height

        For Each paragraph In page.Paragraphs
            ' Paragraph properties
            Dim paragraphNumber As Integer = paragraph.ParagraphNumber
            Dim paragraphText As String = paragraph.Text
            Dim paragraphConfidence As Double = paragraph.Confidence
            Dim textDirection = paragraph.TextDirection

            For Each line In paragraph.Lines
                ' Line details including baseline information
                Dim lineText As String = line.Text
                Dim lineConfidence As Double = line.Confidence
                Dim baselineAngle As Double = line.BaselineAngle
                Dim baselineOffset As Double = line.BaselineOffset

                For Each word In line.Words
                    ' Word-level data
                    Dim wordText As String = word.Text
                    Dim wordConfidence As Double = word.Confidence

                    ' Font information (when available)
                    If word.Font IsNot Nothing Then
                        Dim fontName As String = word.Font.FontName
                        Dim fontSize As Double = word.Font.FontSize
                        Dim isBold As Boolean = word.Font.IsBold
                        Dim isItalic As Boolean = word.Font.IsItalic
                    End If

                    For Each character In word.Characters
                        ' Character-level analysis
                        Dim charText As String = character.Text
                        Dim charConfidence As Double = character.Confidence

                        ' Alternative character choices for spell-checking
                        Dim alternatives() As OcrResult.Choice = character.Choices
                    Next character
                Next word
            Next line
        Next paragraph
    Next page
End Using
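
One practical use of the per-word confidence scores is to flag words for manual review. The sketch below is a minimal illustration rather than an IronOCR API. It works on plain (Text, Confidence) pairs that you would collect from word.Text and word.Confidence in the loops above. The sample words, the helper name FindLowConfidence, and the threshold of 70 (which assumes a 0 to 100 confidence scale) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in data: in a real run these pairs would be collected from
// word.Text and word.Confidence inside the nested loops shown earlier.
var words = new List<(string Text, double Confidence)>
{
    ("Invoice", 96.4),
    ("T0tal", 41.2),
    ("amount", 98.9),
    ("due", 62.0),
};

// Assumption: confidence is reported on a 0-100 scale.
// Adjust the threshold if your values fall between 0 and 1.
const double reviewThreshold = 70.0;

// Hypothetical helper: returns words below the threshold, least confident first.
static List<(string Text, double Confidence)> FindLowConfidence(
    IEnumerable<(string Text, double Confidence)> items, double threshold) =>
    items.Where(w => w.Confidence < threshold)
         .OrderBy(w => w.Confidence)
         .ToList();

foreach (var w in FindLowConfidence(words, reviewThreshold))
{
    Console.WriteLine($"Review: {w.Text} (confidence {w.Confidence})");
}
```

Sorting by ascending confidence puts the most doubtful words first, which helps when only part of the list can be reviewed by hand.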

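Summary

IronOCR provides C# developers with the most advanced Tesseract API implementation, running seamlessly on Windows, Linux, and Mac. Its ability to read text accurately from images, even from imperfect documents, sets it apart from basic OCR solutions.

Notable features of the library include integrated barcode reading, which standard Tesseract does not provide, and export of results as searchable PDF or HOCR HTML directly from .NET code.

Moving Forward

To continue building your IronOCR skills:

Source Code Download

Ready to implement C# OCR image-to-text conversion in your applications? Download IronOCR and start your free trial today.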

Frequently Asked Questions

How can I convert an image to text in C# without using Tesseract?

IronOCR lets you convert images to text in C# without installing or configuring Tesseract yourself. Its built-in methods handle the image-to-text conversion directly, which simplifies the process.

How can I improve OCR accuracy on low-quality images?

IronOCR provides image filters such as Input.Deskew() and Input.DeNoise() that correct skew and reduce noise, which can significantly improve OCR accuracy.

What are the steps to extract text from multi-page documents using OCR in C#?

To extract text from multi-page documents, load PDFs with methods such as LoadPdf() or load the frames of a TIFF file. IronOCR then converts each page to text.

Is it possible to read barcodes and text from an image at the same time?

Yes. IronOCR can read both text and barcodes from a single image. Setting ocr.Configuration.ReadBarCodes = true enables barcode reading, so both text and barcode data are extracted.

How do I configure OCR to process documents in multiple languages?

IronOCR supports more than 125 languages. Set the primary language with ocr.Language and add further languages with ocr.AddSecondaryLanguage() to process multilingual documents.

What methods are available to export OCR results in different formats?

IronOCR offers several export methods, including SaveAsSearchablePdf() for PDF, SaveAsTextFile() for plain text, and SaveAsHocrFile() for HOCR HTML.

How can I optimize OCR processing speed for large image files?

Use OcrLanguage.EnglishFast for faster recognition, and define a specific region with System.Drawing.Rectangle so that OCR runs only on that area, which shortens processing time.

How should I handle OCR for protected PDF files?

Use the LoadPdf() method with the correct password. IronOCR automatically converts the pages of image-based PDFs to images and runs OCR on them.

What should I do if OCR results are inaccurate?

Apply IronOCR's image enhancement features, such as Input.Deskew() and Input.DeNoise(), and check that the correct language pack is installed.

Can I customize the OCR process to exclude specific characters?

Yes. The BlackListCharacters property excludes specific characters from recognition. This focuses the engine on relevant text and can improve both accuracy and processing speed.

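Jacob Mellor

Chief Technology Officer

Jacob Mellor is Chief Technology Officer at Iron Software and a pioneering engineer in C# PDF technology. As the original developer of Iron Software's core codebase, he has shaped the company's product architecture since its inception and, together with CEO Cameron Rimington, grew it into a company of more than 50 people serving NASA, Tesla, and government agencies around the world.

Jacob holds a First-Class Honours Bachelor of Engineering (BEng) in Civil Engineering from the University of Manchester (1998 to 2001). After founding his first software business in London in 1999 and building his first .NET components in 2005, he has specialized in solving complex problems across the Microsoft ecosystem.

His flagship IronPDF and Iron Suite .NET libraries have recorded more than 30 million NuGet installations worldwide, and his core code continues to power tools used by developers around the globe. With 25 years of commercial experience and 41 years of coding expertise, Jacob focuses on driving innovation in enterprise-grade C#, Java, and Python PDF technologies while mentoring the next generation of technical leaders.

Jeffrey T. Fritz

Principal Program Manager, .NET Community Team

Jeff is also a Principal Program Manager for the .NET and Visual Studio teams. He is the executive producer of the .NET Conf virtual conference series and hosts "Fritz and Friends", a live stream for developers that airs twice a week, where he talks about technology and writes code together with viewers. Jeff writes workshops, presentations, and content plans for major Microsoft developer events, including Microsoft Build, Microsoft Ignite, .NET Conf, and the Microsoft MVP Summit.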

Ready to Get Started?

NuGet downloads: 5,570,591 | Version 2026.4 just released
