Indonesian OCR in C# and .NET
IronOCR is a C# software component that lets .NET developers read text from images and PDF documents in 126 languages, including Indonesian.
It is an advanced fork of Tesseract, built exclusively for .NET developers, and it regularly outperforms other Tesseract engines in both speed and accuracy.
Contents of IronOcr.Languages.Indonesian
This package contains the following OCR language variants for .NET:
- Indonesian
- IndonesianBest
- IndonesianFast
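As a rough sketch of how one of the variants listed above might be selected: the example below assumes that the OcrLanguage class exposes IndonesianBest and IndonesianFast members named after the variants in this package, and that the image path is a placeholder. Treat it as an illustration rather than a definitive usage pattern.

```csharp
using System;
using IronOcr;

// Assumption: OcrLanguage exposes IndonesianBest and IndonesianFast members
// that mirror the variant names listed for this package.
var ocr = new IronTesseract();

// Pick the "Best" variant here; swap in OcrLanguage.IndonesianFast
// to try the "Fast" variant instead.
ocr.Language = OcrLanguage.IndonesianBest;

// The image path below is a placeholder.
using (var input = new OcrInput(@"images\Indonesian.png"))
{
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

The variant names suggest a trade-off between recognition quality and speed, so it is worth trying both against your own documents.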
Download
Indonesian Language Pack [Bahasa Indonesia]
Installation
The first step is to install the Indonesian OCR package into your .NET project.
Install-Package IronOCR.Languages.Indonesian
Code Example
This C# code example reads Indonesian text from an image or PDF document.
// Ensure the IronOcr library is installed
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
// Use 'using' keyword to ensure resources are cleaned up
using (var Input = new OcrInput(@"images\Indonesian.png"))
{
// Perform OCR on the input image
var Result = Ocr.Read(Input);
// Store the result text
var AllText = Result.Text;
}
' Ensure the IronOcr library is installed
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
' Use 'using' keyword to ensure resources are cleaned up
Using Input = New OcrInput("images\Indonesian.png")
' Perform OCR on the input image
Dim Result = Ocr.Read(Input)
' Store the result text
Dim AllText = Result.Text
End Using
Why Choose IronOCR?
IronOCR is a .NET software library that is easy to install, complete, and well documented.
Choose IronOCR to achieve 99.8%+ OCR accuracy without using external web services, paying ongoing fees, or sending confidential documents over the internet.
Why C# developers choose IronOCR over vanilla Tesseract:
- Install as a single DLL or NuGet package
- Includes the Tesseract 5, 4, and 3 engines out of the box
- 99.8% accuracy significantly outperforms regular Tesseract
- High speed and multithreading
- Compatible with MVC, web, desktop, console, and server applications
- No EXEs or C++ code to work with
- Full PDF OCR support
- Performs OCR on almost any image file or PDF
- Full support for .NET Core, .NET Standard, and .NET Framework
- Deploy on Windows, Mac, Linux, Azure, Docker, Lambda, AWS
- Read barcodes and QR codes
- Export OCR as XHTML
- Export OCR to searchable PDF documents
- Multithreading support
- 126 international languages, all managed through NuGet or OcrData files
- Extract images, coordinates, statistics, and fonts, not just text
- Can be used to redistribute Tesseract OCR inside commercial and proprietary applications
IronOCR shines when working with real-world images and imperfect documents such as photographs, or low-resolution scans that may contain digital noise or imperfections.
Other free OCR libraries for the .NET platform, such as other .NET Tesseract APIs and web services, do not perform as well on these real-world use cases.
OCR with Tesseract 5 - Start Coding in C#
The code samples below show how easy it is to read text from an image using C# or VB.NET.
OneLiner
// Perform a quick OCR on the screenshot and store the recognized text
string Text = new IronTesseract().Read(@"img\Screenshot.png").Text;
' Perform a quick OCR on the screenshot and store the recognized text
Dim Text As String = (New IronTesseract()).Read("img\Screenshot.png").Text
Configurable Hello World
// Ensure the IronOcr library is installed
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var Input = new OcrInput())
{
// Add image to OCR input
Input.AddImage("images/sample.jpeg");
// Perform OCR on the input
var Result = Ocr.Read(Input);
// Output the result text
Console.WriteLine(Result.Text);
}
' Ensure the IronOcr library is installed
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using Input = New OcrInput()
' Add image to OCR input
Input.AddImage("images/sample.jpeg")
' Perform OCR on the input
Dim Result = Ocr.Read(Input)
' Output the result text
Console.WriteLine(Result.Text)
End Using
C# PDF OCR
The same approach can be used to extract text from any PDF document.
using System.Linq; // Provides the Count() extension method used below
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var input = new OcrInput())
{
input.AddPdf("example.pdf", "password"); // Add a protected PDF to OCR Input
// Perform OCR
var Result = Ocr.Read(input);
// Output the result text
Console.WriteLine(Result.Text);
Console.WriteLine($"{Result.Pages.Count()} Pages"); // Display the number of pages in the PDF
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using input = New OcrInput()
input.AddPdf("example.pdf", "password") ' Add a protected PDF to OCR Input
' Perform OCR
Dim Result = Ocr.Read(input)
' Output the result text
Console.WriteLine(Result.Text)
Console.WriteLine($"{Result.Pages.Count()} Pages") ' Display the number of pages in the PDF
End Using
OCR for Multi-Page TIFFs
IronOCR reads the TIFF file format, including multi-page documents. A TIFF can also be converted directly into a PDF file with searchable text.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var Input = new OcrInput())
{
Input.AddMultiFrameTiff("multi-frame.tiff"); // Add a multi-frame TIFF image
var Result = Ocr.Read(Input);
// Output the result text
Console.WriteLine(Result.Text);
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using Input = New OcrInput()
Input.AddMultiFrameTiff("multi-frame.tiff") ' Add a multi-frame TIFF image
Dim Result = Ocr.Read(Input)
' Output the result text
Console.WriteLine(Result.Text)
End Using
Barcodes and QR Codes
A unique feature of IronOCR is that it can read barcodes and QR codes from documents while scanning for text. Instances of the OcrResult.OcrBarcode class give developers detailed information about each scanned barcode.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Configuration.ReadBarCodes = true; // Enable barcode reading
using (var input = new OcrInput())
{
input.AddImage("img/Barcode.png"); // Add image containing barcode
var Result = Ocr.Read(input);
foreach (var Barcode in Result.Barcodes)
{
Console.WriteLine(Barcode.Value); // Output the value of each barcode
// Barcode type and location properties are also exposed
}
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Configuration.ReadBarCodes = True ' Enable barcode reading
Using input = New OcrInput()
input.AddImage("img/Barcode.png") ' Add image containing barcode
Dim Result = Ocr.Read(input)
For Each Barcode In Result.Barcodes
Console.WriteLine(Barcode.Value) ' Output the value of each barcode
' Barcode type and location properties are also exposed
Next Barcode
End Using
OCR on Specific Areas of an Image
All of IronOCR's scanning and reading methods let you specify exactly which part of a page or pages to read text from. This is very useful when processing standardized forms, and it can save a great deal of time and improve efficiency.
To use crop regions, we need to add a system reference to System.Drawing so that we can use the System.Drawing.Rectangle object.
using IronOcr;
using System.Drawing;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var Input = new OcrInput())
{
var ContentArea = new System.Drawing.Rectangle() { X = 215, Y = 1250, Height = 280, Width = 1335 };
// Dimensions specified in pixels
Input.Add("document.png", ContentArea); // Add cropped area of the image
var Result = Ocr.Read(Input); // Perform OCR
Console.WriteLine(Result.Text); // Output the result text
}
Imports IronOcr
Imports System.Drawing
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using Input = New OcrInput()
Dim ContentArea = New System.Drawing.Rectangle() With {
.X = 215,
.Y = 1250,
.Height = 280,
.Width = 1335
}
' Dimensions specified in pixels
Input.Add("document.png", ContentArea) ' Add cropped area of the image
Dim Result = Ocr.Read(Input) ' Perform OCR
Console.WriteLine(Result.Text) ' Output the result text
End Using
OCR for Low-Quality Scans
The IronOCR OcrInput class can repair scans that regular Tesseract cannot read.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var Input = new OcrInput(@"img\Potter.LowQuality.tiff"))
{
Input.DeNoise(); // Fix digital noise and poor scanning
Input.Deskew(); // Correct rotation and perspective
var Result = Ocr.Read(Input); // Perform OCR
Console.WriteLine(Result.Text); // Output the result text
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using Input = New OcrInput("img\Potter.LowQuality.tiff")
Input.DeNoise() ' Fix digital noise and poor scanning
Input.Deskew() ' Correct rotation and perspective
Dim Result = Ocr.Read(Input) ' Perform OCR
Console.WriteLine(Result.Text) ' Output the result text
End Using
Export OCR Results as a Searchable PDF
Convert images to a PDF with copyable text strings that can be indexed by search engines and databases.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var input = new OcrInput())
{
input.Title = "Quarterly Report"; // Set title for the PDF
input.AddImage("image1.jpeg");
input.AddImage("image2.png");
input.AddImage("image3.gif");
var Result = Ocr.Read(input); // Perform OCR
Result.SaveAsSearchablePdf("searchable.pdf"); // Save result as searchable PDF
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using input = New OcrInput()
input.Title = "Quarterly Report" ' Set title for the PDF
input.AddImage("image1.jpeg")
input.AddImage("image2.png")
input.AddImage("image3.gif")
Dim Result = Ocr.Read(input) ' Perform OCR
Result.SaveAsSearchablePdf("searchable.pdf") ' Save result as searchable PDF
End Using
TIFF to Searchable PDF Conversion
Convert a TIFF document (or any group of image files) directly into a searchable PDF that can be indexed by intranets, websites, and search engines such as Google.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var input = new OcrInput())
{
input.AddMultiFrameTiff("example.tiff"); // Add multi-frame TIFF
Ocr.Read(input).SaveAsSearchablePdf("searchable.pdf"); // OCR and save as searchable PDF
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using input = New OcrInput()
input.AddMultiFrameTiff("example.tiff") ' Add multi-frame TIFF
Ocr.Read(input).SaveAsSearchablePdf("searchable.pdf") ' OCR and save as searchable PDF
End Using
Export OCR Results as HTML
Convert OCR images to XHTML.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var input = new OcrInput())
{
input.Title = "Html Title"; // Set title for XHTML file
input.AddImage("image1.jpeg"); // Add image
var Result = Ocr.Read(input); // Perform OCR
Result.SaveAsHocrFile("results.html"); // Save result as XHTML
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using input = New OcrInput()
input.Title = "Html Title" ' Set title for XHTML file
input.AddImage("image1.jpeg") ' Add image
Dim Result = Ocr.Read(input) ' Perform OCR
Result.SaveAsHocrFile("results.html") ' Save result as XHTML
End Using
OCR Image Enhancement Filters
IronOCR provides unique filters for OcrInput objects to improve OCR performance.
Image Enhancement Code Example
Create higher-quality OCR input images to produce better, faster OCR results.
using IronOcr;
// Create a new instance of IronTesseract
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
using (var Input = new OcrInput(@"LowQuality.jpeg"))
{
Input.DeNoise(); // Fix digital noise and poor scanning
Input.Deskew(); // Correct rotation and perspective
var Result = Ocr.Read(Input); // Perform OCR
Console.WriteLine(Result.Text); // Output the result text
}
Imports IronOcr
' Create a new instance of IronTesseract
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Using Input = New OcrInput("LowQuality.jpeg")
Input.DeNoise() ' Fix digital noise and poor scanning
Input.Deskew() ' Correct rotation and perspective
Dim Result = Ocr.Read(Input) ' Perform OCR
Console.WriteLine(Result.Text) ' Output the result text
End Using
List of OCR Image Filters
The input filters built into IronOCR to improve OCR performance include:
- OcrInput.Rotate(double degrees) - Rotates the image clockwise by the given number of degrees. Use negative numbers for counterclockwise rotation.
- OcrInput.Binarize() - Converts each pixel to black or white. Useful for very low contrast between text and background.
- OcrInput.ToGrayScale() - Converts pixels to grayscale. May not improve accuracy but might speed up processing.
- OcrInput.Contrast() - Automatically increases contrast, often improving speed and accuracy for low-contrast scans.
- OcrInput.DeNoise() - Removes digital noise. Only use when noise is suspected.
- OcrInput.Invert() - Inverts all colors, e.g., white becomes black and vice versa.
- OcrInput.Dilate() - Morphological operation that adds pixels to the boundaries of objects.
- OcrInput.Erode() - Morphological operation that removes pixels on object boundaries.
- OcrInput.Deskew() - Rotates the image to correct its orientation, improving OCR by compensating for tilted scans.
- OcrInput.DeepCleanBackgroundNoise() - Heavy noise removal. Use only when extreme background noise is present. High CPU cost.
- OcrInput.EnhanceResolution - Improves low-resolution images; activated automatically based on DPI settings.
CleanBackgroundNoise: Automatically cleans digital noise, creased paper, and other imperfections that hamper OCR.
EnhanceContrast: Boosts text contrast against backgrounds, improving OCR accuracy and performance.
EnhanceResolution: Detects low-resolution images and enhances them for optimal OCR readability, at the cost of increased processing time.
Language: IronOCR supports many languages and provides settings for applying several of them in a single OCR operation.
Strategy: Two OCR strategies are available: a fast, less accurate one and a detailed AI-driven approach for higher text accuracy.
ColorSpace: Chooses grayscale or color OCR. Grayscale is preferred, but full color may improve results when text and background have similar shades.
DetectWhiteTextOnDarkBackgrounds: Automatically detects negative (white) text on dark backgrounds, ensuring accurate reads.
InputImageType: Tells the OCR library whether it is processing a full document or a snippet (e.g., a screenshot).
RotateAndStraighten: Corrects documents with rotation and perspective issues; especially useful for photographed text.
ReadBarcodes: Automatically reads barcodes and QR codes during text processing with minimal performance impact.
ColorDepth: Higher color depth may improve OCR quality but can extend processing time.
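Several of the OcrInput filters listed above can be applied to the same input before reading. The following is a minimal sketch that uses only filters named in that list; the file name is a placeholder, and the order in which the filters are applied is an assumption for illustration, not a documented recommendation.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Indonesian;

// "LowQuality.jpeg" is a placeholder file name.
using (var input = new OcrInput(@"LowQuality.jpeg"))
{
    // Assumption: this filter order is illustrative only.
    input.Deskew();    // Correct a tilted scan
    input.DeNoise();   // Remove digital noise (only when noise is suspected)
    input.Contrast();  // Increase contrast for low-contrast scans
    input.Binarize();  // Convert each pixel to black or white

    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```

Each filter adds processing time, so it is usually worth starting with none and adding filters one at a time only when the output shows a specific problem.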
126 Language Packs
IronOCR supports 126 international languages through language packs distributed as DLLs, which can be downloaded from this website or from the NuGet Package Manager.
Languages include German, French, English, Chinese, Japanese, and many more. Specialist language packs are available for passport MRZ, MICR checks, financial data, license plates, and more. You can also use Tesseract ".traineddata" files, including ones you create yourself.
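Loading your own ".traineddata" file might look roughly like the sketch below. It assumes that IronTesseract offers a UseCustomTesseractLanguageFile method that accepts a path to the file; both that method name and the file name "custom.traineddata" are assumptions here, so check them against the API reference for your IronOCR version.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();

// Assumption: UseCustomTesseractLanguageFile loads a Tesseract language file from disk.
// "custom.traineddata" is a hypothetical file that you trained or obtained yourself.
ocr.UseCustomTesseractLanguageFile("custom.traineddata");

using (var input = new OcrInput())
{
    // Placeholder image path.
    input.AddImage("images/sample.jpeg");
    var result = ocr.Read(input);
    Console.WriteLine(result.Text);
}
```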
Language Example
Using other OCR languages.
using IronOcr;
// PM> Install-Package IronOcr.Languages.Arabic
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Arabic; // Set language to Arabic
using (var input = new OcrInput())
{
input.AddImage("img/arabic.gif"); // Add Arabic image
var Result = Ocr.Read(input);
// Windows console may struggle with displaying Arabic characters.
// Save to disk instead.
Result.SaveAsTextFile("arabic.txt");
}
Imports IronOcr
' PM> Install-Package IronOcr.Languages.Arabic
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Arabic ' Set language to Arabic
Using input = New OcrInput()
input.AddImage("img/arabic.gif") ' Add Arabic image
Dim Result = Ocr.Read(input)
' Windows console may struggle with displaying Arabic characters.
' Save to disk instead.
Result.SaveAsTextFile("arabic.txt")
End Using
Multiple Languages Example
It is also possible to perform OCR using several languages at the same time. This can really help when extracting English-language metadata and URLs from Unicode documents.
using IronOcr;
// PM> Install-Package IronOcr.Languages.ChineseSimplified
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.ChineseSimplified; // Set primary language
Ocr.AddSecondaryLanguage(OcrLanguage.Indonesian); // Add secondary language
// Multiple languages can be added
using (var input = new OcrInput())
{
input.Add("multi-language.pdf"); // Add multi-language PDF
var Result = Ocr.Read(input);
Result.SaveAsTextFile("results.txt"); // Save results to text file
}
Imports IronOcr
' PM> Install-Package IronOcr.Languages.ChineseSimplified
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.ChineseSimplified ' Set primary language
Ocr.AddSecondaryLanguage(OcrLanguage.Indonesian) ' Add secondary language
' Multiple languages can be added
Using input = New OcrInput()
input.Add("multi-language.pdf") ' Add multi-language PDF
Dim Result = Ocr.Read(input)
Result.SaveAsTextFile("results.txt") ' Save results to text file
End Using
Detailed OCR Results Object
IronOCR returns an OCR result object for every OCR operation. Generally, developers only use the Text property of this object to get the text scanned from an image. However, the OCR result DOM is far more advanced than this.
using IronOcr;
using System.Drawing; // Add assembly reference for System.Drawing
var Ocr = new IronTesseract();
Ocr.Language = OcrLanguage.Indonesian; // Set the language to Indonesian
Ocr.Configuration.EngineMode = TesseractEngineMode.TesseractAndLstm;
Ocr.Configuration.ReadBarCodes = true; // Enable barcode reading
using (var Input = new OcrInput(@"images\sample.tiff"))
{
OcrResult Result = Ocr.Read(Input);
var Pages = Result.Pages;
var Words = Pages[0].Words;
var Barcodes = Result.Barcodes;
// Explore for a detailed and extensive API:
// - Pages, Blocks, Paragraphs, Lines, Words, Chars
// - Extract Images, Font Coordinates, Statistical Data
}
Imports IronOcr
Imports System.Drawing ' Add assembly reference for System.Drawing
Dim Ocr = New IronTesseract()
Ocr.Language = OcrLanguage.Indonesian ' Set the language to Indonesian
Ocr.Configuration.EngineMode = TesseractEngineMode.TesseractAndLstm
Ocr.Configuration.ReadBarCodes = True ' Enable barcode reading
Using Input = New OcrInput("images\sample.tiff")
Dim Result As OcrResult = Ocr.Read(Input)
Dim Pages = Result.Pages
Dim Words = Pages(0).Words
Dim Barcodes = Result.Barcodes
' Explore for a detailed and extensive API:
' - Pages, Blocks, Paragraphs, Lines, Words, Chars
' - Extract Images, Font Coordinates, Statistical Data
End Using
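To give a feel for walking the result tree rather than only reading Result.Text, here is a small sketch that iterates over the pages and the words on each page. It relies on the Pages and Words collections shown in the example above; the assumption is that each word exposes a Text property, and the image path is a placeholder.

```csharp
using System;
using IronOcr;

var ocr = new IronTesseract();
ocr.Language = OcrLanguage.Indonesian;

// Placeholder image path.
using (var input = new OcrInput(@"images\sample.tiff"))
{
    OcrResult result = ocr.Read(input);

    // Walk the result tree: pages first, then the words on each page.
    int pageNumber = 0;
    foreach (var page in result.Pages)
    {
        pageNumber++;
        foreach (var word in page.Words)
        {
            // Assumption: each word exposes a Text property.
            Console.WriteLine($"Page {pageNumber}: {word.Text}");
        }
    }
}
```

The same pattern should extend to the other levels mentioned in the comments above (blocks, paragraphs, lines, characters), assuming they are exposed as similar collections.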
Performance
IronOCR works out of the box, with no need to tune performance or heavily modify input images.
Blazing speed: IronOcr.2020+ is up to 10 times faster and makes over 250% fewer errors than previous versions.
Learn More
To learn more about OCR in C#, VB, F#, or any other .NET language, please read our community tutorials, which give real-world examples of how IronOCR can be used and may show the nuances of getting the best out of this library.
A full object reference for .NET developers is also available.