Web Service API
This page documents the Spring Boot service entry point that exposes document-processing, OCR, redaction, watermarking, and AI-assisted classification endpoints.
Scope
The service module is a direct HTTP API. It is separate from the desktop client and web client, and it focuses on file transformation plus text and classification utilities.
Primary Entry Point
The application is defined in WebServiceApplication.
That class is both:
- the Spring Boot application entry point
- the REST controller for the service endpoints
The main() method loads the iText license key, sets the server port to 8081, and starts the application.
Exposed Endpoints
The current controller surface includes:
- POST /redact for PDF redaction using a multipart file and a redaction payload
- POST /search for locating matching text ranges in a PDF
- POST /watermark/add for applying a watermark to a PDF
- POST /watermark/remove for removing a watermark
- POST /watermark/get for listing detected watermarks
- POST /ocr/get for returning OCR output as JSON or plain data depending on the engine
- POST /ocr/add for adding an OCR layer to uploaded files
- POST /ocr/text for returning OCR text output
- POST /ai/classify/etmf for eTMF content-type prediction
- POST /ai/document/details for combined document AI output
- GET /demo for the demo landing page
- POST /demo/ai/classify for demo classification text output
- POST /demo/ner/stanford for Stanford NER output
- POST /demo/ner/opennlp for OpenNLP-style entity extraction output
- POST /demo/ocr/add for demo OCR PDF generation
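As an illustration of how an internal caller might invoke one of these endpoints, the following minimal sketch builds a multipart request for POST /redact using only the JDK HTTP client types. The part names (file, redactions), the JSON content type of the redaction payload, and the class and method names are assumptions made for this example; the page does not document the actual parameter names the controller expects.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical client-side helper; not part of the service itself. */
final class RedactRequestSketch {
    private static final String CRLF = "\r\n";

    private RedactRequestSketch() {}

    /** Builds a multipart/form-data body with one PDF part and one payload part. */
    static byte[] multipartBody(String boundary, String fileName,
                                byte[] pdfBytes, String payloadJson) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Part name "file" is an assumption, not taken from the controller.
        write(out, "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"file\"; filename=\""
                + fileName + "\"" + CRLF
                + "Content-Type: application/pdf" + CRLF + CRLF);
        out.write(pdfBytes, 0, pdfBytes.length);
        // Part name "redactions" is likewise an assumption.
        write(out, CRLF + "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"redactions\"" + CRLF
                + "Content-Type: application/json" + CRLF + CRLF
                + payloadJson + CRLF
                + "--" + boundary + "--" + CRLF);
        return out.toByteArray();
    }

    /** Wraps the body in a POST request against the /redact endpoint. */
    static HttpRequest redactRequest(String baseUrl, String boundary, byte[] body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/redact"))
                .header("Content-Type", "multipart/form-data; boundary=" + boundary)
                .POST(HttpRequest.BodyPublishers.ofByteArray(body))
                .build();
    }

    private static void write(ByteArrayOutputStream out, String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        out.write(bytes, 0, bytes.length);
    }
}
```

A caller would pass the resulting request to java.net.http.HttpClient.send with a base URL such as http://localhost:8081, and would read the redacted PDF from the response body.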
Supporting Services
The controller delegates work to the service and utility layer, including:
- OcrUtils
- VisionApiUtils
- ItextUtils for PDF cleanup, redaction, and watermark handling
- ClassificationUtils for PDF text extraction and classification helpers
- VertexAiUtils for eTMF classification and document AI output
- TextUtils for named-entity extraction
- DocumentAiDetails as the combined AI response object
OCR Behavior
OcrUtils supports three engine modes:
- documentAi for document AI processing
- ocr for Google Vision OCR
- tesseract for local HOCR-oriented processing
VisionApiUtils bridges Google Vision OCR and OCR-layer generation over PDF and image inputs.
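The three engine mode strings could be modeled on the caller or controller side roughly as follows. This is a sketch only: the enum, its constant names, and the case-sensitive lookup are assumptions for illustration, and the page does not say how OcrUtils actually parses or dispatches on the mode value.

```java
import java.util.Arrays;
import java.util.Optional;

/** Hypothetical enum mirroring the three documented engine mode strings. */
enum OcrEngineMode {
    DOCUMENT_AI("documentAi"),   // document AI processing
    GOOGLE_VISION("ocr"),        // Google Vision OCR
    TESSERACT("tesseract");      // local HOCR-oriented processing

    private final String wireName;

    OcrEngineMode(String wireName) {
        this.wireName = wireName;
    }

    /** The string form of the mode as listed on this page. */
    String wireName() {
        return wireName;
    }

    /** Looks up a mode by its exact string; empty if the value is not recognized. */
    static Optional<OcrEngineMode> fromWireName(String value) {
        return Arrays.stream(values())
                .filter(mode -> mode.wireName.equals(value))
                .findFirst();
    }
}
```

Returning an Optional rather than throwing lets the caller decide whether an unrecognized mode should be rejected or replaced with a default.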
Implementation Notes
- Most endpoints operate on multipart file uploads and return generated files or structured JSON directly.
- The service writes temporary files during processing rather than streaming transformations in place.
- The AI endpoints combine OCR text, entity extraction, and content-type predictions to support downstream document triage.
- The demo endpoints are intentionally separate from the main API surface and are useful as reference behavior for the helpers.
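The temporary-file approach mentioned in the notes above can be sketched as follows: the upload is written to a temp file, a transformation produces a second temp file, and both are deleted once the result has been read. The class, the FileTransform callback, and the file prefixes are hypothetical names chosen for this example; the service's actual helpers and cleanup behavior are not described on this page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical callback: reads the input file and writes the output file. */
@FunctionalInterface
interface FileTransform {
    void apply(Path input, Path output) throws IOException;
}

/** Illustrative temp-file round trip; not the service's real implementation. */
final class TempFileProcessingSketch {
    private TempFileProcessingSketch() {}

    static byte[] processViaTempFiles(byte[] uploaded, FileTransform transform) {
        Path input = null;
        Path output = null;
        try {
            // Files land in the JVM's default temp directory (java.io.tmpdir).
            input = Files.createTempFile("upload-", ".pdf");
            output = Files.createTempFile("result-", ".pdf");
            Files.write(input, uploaded);
            transform.apply(input, output);
            return Files.readAllBytes(output);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Clean up even when the transformation fails.
            deleteQuietly(input);
            deleteQuietly(output);
        }
    }

    private static void deleteQuietly(Path path) {
        if (path == null) {
            return;
        }
        try {
            Files.deleteIfExists(path);
        } catch (IOException ignored) {
            // Best-effort cleanup; a leftover temp file is not fatal here.
        }
    }
}
```

Deleting in a finally block keeps uploaded documents from accumulating on disk when a transformation throws.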
Authentication and Deployment Context
Invocation model
WebServiceApplication is an internal utility microservice. It is not an end-user-facing API and does not expose an authentication layer. The service is expected to be co-deployed with the main SureClinical web application, isolated behind the application server's network boundary.
| Property | Value |
|---|---|
| Default port | 8081 (set in main() via server.port) |
| Auth mechanism | None — network-level access control only |
| Expected callers | The SureClinical web application server (internal service-to-service calls) |
| Public exposure | Should not be directly accessible from the internet |
Security constraints
- No endpoint validates a session token, API key, or OAuth credential; most accept multipart file uploads directly.
- The iText license key is loaded on startup from a classpath or file-system resource — it must be present for PDF operations to succeed.
- Temporary files are written during processing. The working directory must be writable and on a local, trusted volume.
- Demo endpoints (/demo/*) are included in the same application context. These expose representative request/response behavior and should be disabled or network-restricted in production deployments.
Hardening checklist
- Restrict port 8081 to localhost or a private network interface in production.
- Ensure the directory used for temporary file writes is not web-accessible.
- Remove or firewall the /demo/* routes in hardened deployments.
- Audit multipart upload size limits: WebServiceApplication does not document a spring.servlet.multipart.max-file-size setting, so default Spring Boot limits apply unless overridden.
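
One way to check the first checklist item is to verify, in a deployment script or startup guard, that the configured bind address is a loopback or private address. The sketch below uses only java.net.InetAddress; the class and method names are hypothetical, and nothing on this page indicates that the service performs such a check itself.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical guard for the "localhost or private interface" checklist item. */
final class BindAddressCheckSketch {
    private BindAddressCheckSketch() {}

    /**
     * Returns true when the IP literal is a loopback address (such as 127.0.0.1)
     * or a site-local private address (such as 10.x.x.x or 192.168.x.x).
     * Pass IP literals; a host name would trigger a DNS lookup.
     */
    static boolean isLoopbackOrPrivate(String ipLiteral) {
        try {
            InetAddress address = InetAddress.getByName(ipLiteral);
            return address.isLoopbackAddress() || address.isSiteLocalAddress();
        } catch (UnknownHostException e) {
            // Treat an unparseable address as unsafe.
            return false;
        }
    }
}
```

A wildcard bind address such as 0.0.0.0 fails this check, which matches the intent of keeping port 8081 off public interfaces.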