Oracle APEX: How to extract text from inside a PDF or Word Doc

In this video we show you how to quickly extract all the text from inside a PDF or Word document and store it in a CLOB for quick retrieval.

Our video begins with a table containing a BLOB column that holds several PDF and .doc files. We start by creating an Oracle Text index on that column, using the CTXSYS.CONTEXT index type with the CTXSYS.AUTO_FILTER filter, so that the documents can be filtered into plain text.

With the text stored in a CLOB, Oracle APEX can display and process it for users in a fraction of the time it would take to display or download the BLOB.
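
The steps described above could be sketched roughly as follows. This is a minimal illustration, not the sample code from the video: the table `resumes`, its columns, and the index name `resumes_doc_ctx` are hypothetical, and storing the extracted text in a CLOB column of the same table is an assumption (a separate result table would work as well). It uses the in-memory CLOB form of `CTX_DOC.FILTER` and assumes the table has a primary key and that the schema has execute privilege on `CTX_DOC` (typically granted through the `CTXAPP` role).

```sql
-- Hypothetical table: one BLOB per document, plus a CLOB for the extracted text.
CREATE TABLE resumes (
  id       NUMBER PRIMARY KEY,
  filename VARCHAR2(255),
  doc      BLOB,
  doc_text CLOB
);

-- Oracle Text index on the BLOB column. AUTO_FILTER detects the binary
-- format (PDF, Word, ...) and converts it to text for indexing.
CREATE INDEX resumes_doc_ctx ON resumes (doc)
  INDEXTYPE IS CTXSYS.CONTEXT
  PARAMETERS ('FILTER CTXSYS.AUTO_FILTER');

-- Filter each document through the index and store the plain text.
DECLARE
  l_text CLOB;
BEGIN
  FOR r IN (SELECT id FROM resumes) LOOP
    -- Third argument is a CLOB locator, so the in-memory variant is used;
    -- TRUE requests plain text instead of HTML.
    CTX_DOC.FILTER('RESUMES_DOC_CTX', TO_CHAR(r.id), l_text, TRUE);

    UPDATE resumes
       SET doc_text = l_text
     WHERE id = r.id;

    -- CTX_DOC.FILTER allocates a temporary CLOB; release it after use.
    DBMS_LOB.FREETEMPORARY(l_text);
  END LOOP;
  COMMIT;
END;
/
```

An APEX report or dialog can then select `doc_text` directly instead of downloading the BLOB.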

Sample Code leveraged in this video:

Related Videos of Interest:
How to print a CLOB inside a Dialog Window in Oracle APEX

How to enable Full Text Search on a BLOB

Oracle APEX: How to add an Image as BLOB to Existing Table/Form/Report
Comments

This was super helpful. I'm passing the text from the PDF to the Cohere API and asking questions about what's in the PDF and it works like a dream. Thank you!

jeeves

Wonderful! I've been trying to convert a PDF/BLOB to CLOB in APEX for days! Thank you!

organismisimbiotici

Is there a way to do all this without inserting the resumes into the table ahead of time, and instead upload them from a page in the app that saves them into the table? If so, how do I do this?

LiandiObermeyer

Here on Oracle 11g it does not work. It just shows a ' - ' after the process succeeds. Do you know why?

kauecastelani

Hi, can we extract the images from a Word/.docx file in the same way?

uselvan

Hi, I have done all the steps and it works really well, but when I try to convert a large PDF (54 pages) it only gives me a string like SKM_C3320i23022111200 in the filtered_docs table. Is there a size limit with this index?

luisf.rodriguezgarcia

Hello, can we upload records from Oracle APEX to an MDB format file?

asifiqbal

Hi, I am getting an error with the CTX_DOC package. I am using APEX 21.2 and Database 12c:

ORA-20000: Oracle Text error:
DRG-50857: oracle error in ctx_doc.filter
ORA-20000: Oracle Text error:
DRG-11207: user filter command exited with status 127

deepakdakhore