Logging in with "Administrator" |
---|
As you may have noticed, in versions 1.9.8.2.7 and above the Casebrowser follows Patricia's user permissions. |
Some of the diagnosis and maintenance endpoints documented below modify your document repository by deleting documents. Be sure you understand exactly what an endpoint does before using it against your production repository.
1. Get processing info
This servlet has no input parameters. It returns the elasticsearch configuration and the number of processed documents in the categories textExtraction, mailAnalysis, previewGenerating and metadataSync.
The processing state of a file is tracked in the fields piTextExtractionLogicVersion, mailAnysLogicVersApplied, previewLogicVersApplied, metadataSyncLogicVersion and piFullTextTransferLogicVersion:
- NULL or a value of {<1, <3, <3, <1, <1}, respectively, indicates an unprocessed file.
- A value of {1, 3, 3, 1, 1}, respectively, indicates a processed file.
- A value of 99 in any of these fields indicates that the file caused an error during processing. Such a file will be skipped.
Endpoint: http://<server_host>/casebrowser/admin/processing-info
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/processing-info
Run from within the Linux Console of your eDMS Appliance:
docker exec -ti nuxeo curl -X POST -H "Content-Type: application/json" -u Administrator:Administrator -d '{}' localhost:8080/nuxeo/site/automation/ProcessingInfo
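The version values described above can be turned into a small lookup. The following is a minimal sketch only: the function name classify and the "unknown" result for values the documentation does not describe are assumptions made for illustration, not part of the DMS.

```python
# Thresholds as listed in this section: a field equal to its threshold means
# "processed", 99 means "error", NULL or a smaller value means "unprocessed".
PROCESSED_VERSION = {
    "piTextExtractionLogicVersion": 1,
    "mailAnysLogicVersApplied": 3,
    "previewLogicVersApplied": 3,
    "metadataSyncLogicVersion": 1,
    "piFullTextTransferLogicVersion": 1,
}
ERROR_VERSION = 99


def classify(field, value):
    """Return 'unprocessed', 'processed', 'error' or 'unknown' for one field."""
    target = PROCESSED_VERSION[field]
    if value is None or value < target:
        return "unprocessed"
    if value == ERROR_VERSION:
        return "error"
    if value == target:
        return "processed"
    # Hypothetical fallback: the documentation does not describe other values.
    return "unknown"
```

For example, classify("mailAnysLogicVersApplied", 2) yields "unprocessed", while the same field with the value 99 yields "error".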
2. Reset servlet
This servlet resets the piTextExtractionLogicVersion, mailAnysLogicVersApplied, previewLogicVersApplied and (since version 1.9.8.2.2) metadataSyncLogicVersion fields for the documents in a given case or for all documents in the DB. Starting with version 1.9.8.2.4-2, this servlet can also remove the extracted text of documents from the postgres DB to which the full-text-transfer sweeper copies the elasticsearch data (to increase stability and speed of certain operations).
The endpoint has two input parameters:
- case - case reference (e.g. P1046DEEP), or
- type - can have the value "COMPLETE" or, as of version 1.9.8.2.3, the value "EMPTY" when resetting piTextExtractionLogicVersion, or, as of version 1.9.8.2.4-1, the value "EMPTY" when resetting previewLogicVersApplied.
The value "COMPLETE" causes the fields textExtraction, mailAnalysis, previewGenerating and metadataSync, respectively, to be reset for ALL documents in the DB (regardless of the case parameter).
The value "EMPTY" causes the field textExtraction to be reset only for those documents in the DB that have an empty index, e.g. due to incorrect OCRing. The type="EMPTY" parameter can be combined with case=<case reference> to limit the reset to a specific case by using "?case=<case reference>&type=EMPTY".
Either the case or the type parameter must be provided for the reset to start.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoints:
- http://<server_host>/casebrowser/admin/reset/preview-analysis - to reset previewLogicVersApplied
- http://<server_host>/casebrowser/admin/reset/email-analysis - to reset mailAnysLogicVersApplied
- http://<server_host>/casebrowser/admin/reset/text-analysis - to reset piTextExtractionLogicVersion
- http://<server_host>/casebrowser/admin/reset/metadata - to reset metadataSyncLogicVersion --- as of version 1.9.8.2.2
- http://<server_host>/casebrowser/admin/reset/full-text-transfer - to remove the postgres text data store --- from version 1.9.8.2.4-2 until 1.9.8.2.9
Examples: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/reset/preview-analysis?case=P1046DEEP
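The parameter rules for the reset servlet can be sketched as a small URL builder. This is an illustration under stated assumptions: build_reset_url and the RESET_PATHS table are hypothetical helper names, and the restriction of "EMPTY" to the text and preview resets follows the description in this section.

```python
from urllib.parse import urlencode

# Field name -> last path segment of the reset endpoint (from the list above).
RESET_PATHS = {
    "previewLogicVersApplied": "preview-analysis",
    "mailAnysLogicVersApplied": "email-analysis",
    "piTextExtractionLogicVersion": "text-analysis",
    "metadataSyncLogicVersion": "metadata",
}
# "EMPTY" is only documented for these two resets.
EMPTY_ALLOWED = {"piTextExtractionLogicVersion", "previewLogicVersApplied"}


def build_reset_url(server_host, field, case=None, reset_type=None):
    """Build a reset URL; either case or reset_type must be given."""
    if case is None and reset_type is None:
        raise ValueError("either case or type must be provided")
    if reset_type not in (None, "COMPLETE", "EMPTY"):
        raise ValueError("type must be COMPLETE or EMPTY")
    if reset_type == "EMPTY" and field not in EMPTY_ALLOWED:
        raise ValueError("EMPTY is not documented for this reset")
    params = {}
    if case is not None:
        params["case"] = case
    if reset_type is not None:
        params["type"] = reset_type
    path = RESET_PATHS[field]
    return f"http://{server_host}/casebrowser/admin/reset/{path}?{urlencode(params)}"
```

Calling build_reset_url("nux7.priv.pace-ip.com:9080", "previewLogicVersApplied", case="P1046DEEP") reproduces the example URL of this section.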
3. Start SweepRunner
This servlet fires an event to start the SweepRunners for (1) text extraction, (2) email analysis, (3) preview generation, (4) metadata sync (since version 1.9.8.2.2) and (5) full-text transfer (since version 1.9.8.2.4-2). A SweepRunner runs until (a) the next start of office hours (6:00 am) is reached (is.xxx.office.hours = "") or (b) all files are processed (when is.xxx.office.hours = TRUE; see the PAT_DMS_SETTINGS configuration). If a SweepRunner is already running, starting it manually has no effect.
A SweepRunner started during office hours will also run. Consider the CPU load a SweepRunner causes before starting it during office hours!
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoints:
- http://<server_host>/casebrowser/admin/run/start-preview-processing
- http://<server_host>/casebrowser/admin/run/start-ocr-processing
- http://<server_host>/casebrowser/admin/run/start-email-processing
- http://<server_host>/casebrowser/admin/run/start-metadata-sync-processing --- as of version 1.9.8.2.2
- http://<server_host>/casebrowser/admin/run/start-full-text-transfer --- as of version 1.9.8.2.4-2
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/run/start-ocr-processing
Legacy (1.9.7.1 and earlier):
The legacy endpoint starts the sweep runner for all three analyses at once, without distinction.
Endpoint: http://<server_host>/casebrowser/admin/run/nightly-batch
4. Get documents index information
This endpoint takes a case name as input parameter and returns elasticsearch index value for all documents in the case.
Input parameter: case_number - case name (e.g. P1046DEEP)
Important: This endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/documents-index-info
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/documents-index-info?case_number=P1046DEEP
Sample output: sample-case-info.json
5. Get DMS system status information – available in DMS version 1.9.8+
This endpoint checks availability of all services DMS depends on. If any dependency is unavailable or times out, the servlet returns a FAIL message. Otherwise OK is returned. The servlet also returns the installed DMS version.
Endpoint: http://<server_host>/casebrowser/admin/status
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/status
Sample output when all dependencies are working:
Casebrowser version: 1.9.8.1149.20160729112112
DMS System OK:
Nuxeo database OK: jdbc:postgresql://postgres:5432/nuxeo
Patricia database OK: jdbc:sqlserver://192.168.72.130:1433
Nuxeo web service OK: dms:8080 is accessible
Postfix OK: dms:25 is accessible
Elasticsearch OK: ES status = GREEN
Casebrowser web service OK: cb:9080 is accessible
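A monitoring script could evaluate the status text along the following lines. This is a rough sketch under assumptions: it expects one check per line, and it treats any occurrence of the word FAIL as a failed dependency, because the exact layout of a failure message is not shown here. The name parse_status is hypothetical.

```python
SAMPLE = """Casebrowser version: 1.9.8.1149.20160729112112
DMS System OK:
Nuxeo database OK: jdbc:postgresql://postgres:5432/nuxeo
Elasticsearch OK: ES status = GREEN
Casebrowser web service OK: cb:9080 is accessible"""


def parse_status(text):
    """Return ({check name: detail}, healthy) for a status response."""
    checks = {}
    for line in text.splitlines():
        # Split at the first colon only, so JDBC URLs and host:port stay intact.
        name, sep, detail = line.partition(":")
        if sep:
            checks[name.strip()] = detail.strip()
    # Assumption: a failing dependency is reported with the word FAIL.
    healthy = "FAIL" not in text
    return checks, healthy


checks, healthy = parse_status(SAMPLE)
print(checks["Casebrowser version"], healthy)
```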
6. RENAMED as of 1.9.8.2.8 – Verify and fix sync status of documents – available in DMS version 1.9.8.2+
Starting with version 1.9.8.2.8, this endpoint has been renamed to /PdlCheckSync. The deprecated name was "check_sync".
This endpoint lets you check the sync status of documents for a given period of time. Sync discrepancies, i.e. documents that are present in the DMS but not in PAT_DOC_LOG, can be resolved by this endpoint: the information is synced so that the documents also show correctly in the Patricia documents tab. The opposite is not possible: entries in PAT_DOC_LOG that do not point to a valid entry in the DMS (e.g. because the document was manually deleted in an older version of the DMS) cannot be corrected and must be deleted manually via Patricia or an SQL script.
The sync endpoint allows you to specify a "Start date" and an "End date" (defining the period for which sync discrepancies should be checked) as well as a document type (e.g. eml or doc) to limit the output to documents of that type. In the resulting list you can select unsynced documents and re-sync their information with PAT_DOC_LOG by pressing the "Sync" button.
The pdlCheckSync endpoint does not work against so called name-documents. These documents will be ignored in the sync checking process.
Note: As of version 1.9.8.2.3, the endpoint allows deleting documents in the DMS that have no entry in PAT_DOC_LOG. This is, however, only possible during import, i.e. while file.import.complete == 'FALSE'. Make sure you understand the impact such a delete action has on your data before using it.
Important: Depending on your repository size and the date range set, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Starting with version 1.9.8.2.9-2, documents can be synchronised without the UI by means of two new request parameters. Adding either of them switches the endpoint to non-UI mode:
- immediateSync=true creates PAT_DOC_LOG entries for all DMS documents that have no relevant entry
- immediatePatDocLogDelete=true deletes all PAT_DOC_LOG entries that have no corresponding DMS document
These parameters can be combined with any other parameters of the endpoint. For example,
http://<server_host>/casebrowser/admin/PdlCheckSync?usePdlCompare=true&immediateSync=true&immediatePatDocLogDelete=true
creates PAT_DOC_LOG entries for all DMS documents that have no relevant entry in the PAT_DOC_LOG table and deletes all PAT_DOC_LOG entries that have no corresponding DMS document, in Global Sync mode (see paragraph 26), while
http://<server_host>/casebrowser/admin/PdlCheckSync?startDate=2022-01-01&endDate=2023-01-01&immediateSync=true&immediatePatDocLogDelete=true
performs the same operations, but only for documents created in 2022.
Endpoint: http://<server_host>/casebrowser/admin/PdlCheckSync
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/PdlCheckSync
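For scripted (non-UI) use, the query string can be assembled as in the following sketch. The function name build_pdl_check_sync_url and its keyword arguments are hypothetical. Only request parameters named in this section are emitted, and the document type filter is left out because its parameter name is not given here.

```python
from urllib.parse import urlencode


def build_pdl_check_sync_url(server_host, start_date=None, end_date=None,
                             use_pdl_compare=False, immediate_sync=False,
                             immediate_pat_doc_log_delete=False):
    """Build a PdlCheckSync URL; either immediate* flag selects non-UI mode."""
    params = {}
    if use_pdl_compare:
        params["usePdlCompare"] = "true"
    if start_date:
        params["startDate"] = start_date  # e.g. "2022-01-01"
    if end_date:
        params["endDate"] = end_date
    if immediate_sync:
        params["immediateSync"] = "true"
    if immediate_pat_doc_log_delete:
        params["immediatePatDocLogDelete"] = "true"
    url = f"http://{server_host}/casebrowser/admin/PdlCheckSync"
    return f"{url}?{urlencode(params)}" if params else url
```

Without any arguments besides the host, the function returns the plain UI endpoint. Because immediatePatDocLogDelete=true deletes data, such a helper should be called deliberately and not from a default code path.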
7. Purging deleted documents – available in DMS version 1.9.8.2+
This endpoint lets you delete documents from the database that have been deleted in the DMS. Note that documents deleted using this endpoint are permanently deleted from the database and can no longer be recovered from the DMS trash.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Further important note: To regain the disk space occupied by the deleted documents, the binaries of these documents must be removed using the nuxeo web console after the documents have been deleted from the database using the endpoint:
- Log into the Nuxeo web console as administrator;
- Go to "Admin" > "System Information" > "Repository binaries";
- Check the Delete orphaned binaries check box. If you just want to gather statistics about what is going to be removed from disk, do not check this box and go to the next step;
- Click on Mark orphaned binaries.
This will permanently remove all binaries that are no longer used by the Nuxeo DMS system and free up their disk space.
Endpoint: http://<server_host>/casebrowser/admin/purgetrash
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/purgetrash
Run from within the Linux Console of your eDMS Appliance:
docker exec -ti nuxeo curl -X POST -H "Content-Type: application/json" -u Administrator:Administrator -d '{}' localhost:8080/nuxeo/site/automation/purgetrash
8. Purging email attachments that are no longer linked to emails – available in DMS version 1.9.8.2+
This endpoint lets you search and purge all email attachments in the system that are no longer linked to emails. This can happen, for example, if emails are deleted from the system.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Optional input parameter: dryrun - by default, the endpoint will run with dryrun=true (no data will be altered). To alter data, set dryrun=false.
Endpoint: http://<server_host>/casebrowser/admin/purge_unlinked_attachments
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/purge_unlinked_attachments
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/purge_unlinked_attachments?dryrun=false
Run from within the Linux Console of your eDMS Appliance:
docker exec -ti nuxeo curl -X POST -H "Content-Type: application/json" -u Administrator:Administrator -d '{}' 'localhost:8080/nuxeo/site/automation/purge_unlinked_attachments?dryrun=false'
9. Purging temp documents – available in DMS version 1.9.8.2.3-3+
This endpoint lets you search and purge all temp documents in the system. Temp documents are documents whose names start with "._" or "~$" or that have the extension ".tmp".
Optional input parameter: dryrun - by default, the endpoint will run with dryrun=true (no data will be altered). To alter data, set dryrun=false.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/delete_temp_documents
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/delete_temp_documents
Run from within the Linux Console of your eDMS Appliance:
docker exec -ti nuxeo curl -X POST -H "Content-Type: application/json" -u Administrator:Administrator -d '{}' localhost:8080/nuxeo/site/automation/delete_temp_documents
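To estimate in advance which files the endpoint would regard as temp documents, the naming rule of this section can be written as a predicate. This is a sketch only: is_temp_document is a hypothetical name, and the case-sensitive matching is an assumption, since the section does not say how the DMS treats upper-case variants such as ".TMP".

```python
def is_temp_document(name):
    """True if the file name matches the temp-document rule of this section."""
    # Prefixes "._" and "~$", or the extension ".tmp" (assumed case-sensitive).
    return name.startswith(("._", "~$")) or name.endswith(".tmp")


for candidate in ["._report.docx", "~$letter.docx", "scan.tmp", "letter.docx"]:
    print(candidate, is_temp_document(candidate))
```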
10. Removing entries from the dbo.saved_emails table for deleted emails – available in DMS version 1.9.8.2.3-3+
This endpoint lets you search and purge all entries from the dbo.saved_emails table for emails that had been deleted from the repository in earlier versions. This process will be necessary to align the dbo.saved_emails table entries with the repository following upgrade to version 1.9.8.2.3-3+.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Optional input parameter: dryrun - by default, the endpoint will run with dryrun=true (no data will be altered). To alter data, set dryrun=false.
Endpoint: http://<server_host>/casebrowser/admin/purge_removed_emails
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/purge_removed_emails
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/purge_removed_emails?dryrun=false
11. Check license status – available in DMS version 1.9.8.2+
This endpoint returns the license status of the current DMS installation as JSON. The parameters returned are:
- license key valid (true or false)
- number of licenses (i.e. number of concurrent users allowed to log onto the DMS)
- expiration date of the license
Endpoint: http://<server_host>/casebrowser/admin/license-status
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/license-status
Sample output: {"expirationDate":"2017-10-31","numberOfLicenses":37,"valid":true}
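The JSON response lends itself to an automated check. The following sketch parses the sample output shown above. The function name license_ok is hypothetical, and treating the expiration date itself as still valid is an assumption.

```python
import json
from datetime import date

SAMPLE = '{"expirationDate":"2017-10-31","numberOfLicenses":37,"valid":true}'


def license_ok(payload, today):
    """True if the license is flagged valid and has not expired by `today`."""
    status = json.loads(payload)
    expires = date.fromisoformat(status["expirationDate"])
    # Assumption: the license is still usable on the expiration date itself.
    return bool(status["valid"]) and today <= expires


print(license_ok(SAMPLE, date(2017, 10, 1)))
```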
12. Check license use – available in DMS version 1.9.8.2.4-2+
This endpoint returns the license usage statistics of the DMS installation at the time the endpoint is called. The values returned are:
- User ID
- number of sessions (i.e. the user being logged into the DMS through different browsers)
- number of licenses consumed (per license, a user can have 3 active sessions - see before)
Endpoint: http://<server_host>/casebrowser/admin/license-use
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/license-use
13. Deleting documents from repository that were created after import was finished (ie. roll-back to imported state) – available in DMS version 1.9.8.2.2+
As of version 1.9.8.2.2, the DMS tags documents during import as "imported documents", as opposed to documents that are created in the DMS after an import has finished. This endpoint deletes all documents that were created in the DMS post import (e.g. during a test phase) and rolls the repository back to the state when the import finished. A second import can then be run to perform a differential import (repo re-sync) so that a client system can be rolled out live.
As a security measure, this endpoint will only work if key file.import.completed = FALSE in PAT_DMS_SETTINGS table of Patricia database, which needs to be set when a differential import (repo re-sync) is run.
Note: This endpoint also works for documents that have been added during testing in the names section of Patricia (i.e. so called name-documents). Any name documents added after the import will be removed.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Optional input parameter: dryrun - by default, the endpoint will run with dryrun=true (no data will be altered). To alter data, set dryrun=false.
Endpoint: http://<server_host>/casebrowser/admin/delete_nonimported
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/delete_nonimported
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/delete_nonimported?dryrun=false
Run from within the Linux Console of your eDMS Appliance:
docker exec -ti nuxeo curl -X POST -H "Content-Type: application/json" -u Administrator:Administrator -d '{}' 'localhost:8080/nuxeo/site/automation/delete_nonimported?dryrun=false'
14. Comparing documents in the DMS Documents tree with a file system tree (eg. from which an import ran) – available in DMS version 1.9.8.2.4-2+
This endpoint lets you run a comparison of the documents in the DMS Documents tree (eg. /Workspaces/Patricia/Documents/) with a corresponding file system tree from which a DMS documents import had run. This way, it can be confirmed that for every document in the file system documents tree, a corresponding document in the DMS tree exists. The comparison only checks for the existence of corresponding documents; it does not verify the integrity of corresponding documents; integrity checks are done at the import stage.
The comparison outputs a list of documents that are present in the file tree where no corresponding document could be found in the DMS Documents tree. The opposite does not produce any output, ie. when there is a match. Additionally, the endpoint will provide some statistics on the comparison results.
For this endpoint to run, it is necessary to modify the commands.conf file by adding a mountpoint for the file system tree and the import result to the configuration for the nuxeo container and to redeploy the environment (see Auto-deploy script). Specifically, the addition can be as follows in the "NUXEO" section of the commands.conf file (with <importPath> representing the file system tree path, e.g. "/import"):
elif [ ${1} = "NUXEO" ]
then
    add_port 8080
    add_volume ${NUXEO_CONFDIR} "/etc/nuxeo"
    add_volume ${NUXEO_DATADIR} "/var/lib/nuxeo/data"
    add_volume ${NUXEO_WINFONTDIR} "/usr/share/fonts/custom-fonts"
    add_volume ${NUXEO_LOGDIR} "/var/log/nuxeo"
    add_volume ${NUXEO_TEMPDIR} "/opt/nuxeo/server/tmp"
    add_volume <importPath> "/home/nuxeo/import" ## this line needs to be added; default is /home/nuxeo/import. <importPath> needs to hold the Patricia "Documents" folder
    add_volume "/root/deploy/import_script/importer" "/home/nuxeo/importer" ## this line needs to be added; it holds the "urls.txt" file generated by the import script
Important: This endpoint consumes high CPU and, in particular, causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Optional input parameters:
- importedDocumentsPath defaults to "/home/nuxeo/import/Documents". This parameter points to the ../Documents/ directory in the path of the file tree containing imported documents.
- urlsPath defaults to "/home/nuxeo/importer/urls.txt". "urls.txt" is an output file created automatically when running the import script. It contains a list of the case folders to be compared.
- start defaults to 0. This denotes the first case to be compared from the "urls.txt" control file; the number represents the line number in the "urls.txt" file (less the first "logActivate" line).
- end defaults to 0. This denotes the last case to be compared from the "urls.txt" control file. If end == 0, end < start or end > the number of lines in the "urls.txt" file, the comparison is executed until the end of the "urls.txt" control file.
Endpoint: http://<server_host>/casebrowser/admin/operation/CompareImported?importedDocumentsPath=<importDocumentsPath>&urlsPath=<urlsPath>&start=<start>&stop=<stop>
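The fallback rule for the end of the comparison range can be sketched as follows. The function name effective_end is hypothetical, and total_lines stands for the number of lines in the "urls.txt" file as counted by the endpoint. Whether that count includes the first "logActivate" line is not specified here and is left open.

```python
def effective_end(start, end, total_lines):
    """Return the last line that is compared, per the rule described above."""
    # end == 0, end < start or end > total_lines all mean "run to the end".
    if end == 0 or end < start or end > total_lines:
        return total_lines
    return end


print(effective_end(0, 0, 100))    # defaults: compare the whole file
print(effective_end(10, 50, 100))  # explicit range inside the file
```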
Post processing: After the comparison has concluded, it is recommended to remove or comment out the above additions to the commands.conf file and redeploy the environment once more without such mount points. After this, the file system tree paths can be removed from the DMS VM.
15. Creating folders in DMS for empty Patricia cases – available in DMS version 1.9.8.2.3+
During import, when a Patricia case contains no documents, no corresponding case folder will be created in the DMS (as there is none in Patricia). This endpoint will create empty folders in the DMS for those cases where creation of corresponding folders failed during import. It is vital to have these folders correctly created so that automatic saving of emails works for these cases.
As a security measure, this endpoint will only work if key file.import.completed = FALSE in PAT_DMS_SETTINGS table of Patricia database, which needs to be set when a differential import (repo re-sync) is run.
Note: This endpoint is not destructive so it is considered relatively safe to use.
Important: Depending on your repository size, this endpoint may consume high CPU and should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Optional input parameter: dryrun - by default, the endpoint will run with dryrun=true (no data will be altered). To alter data, set dryrun=false.
Endpoint: http://<server_host>/casebrowser/admin/case_folder_sync
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/case_folder_sync
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/case_folder_sync?dryrun=false
16. Syncing locally installed fonts to nuxeo (for use in email drafts) – available in DMS version 1.9.8.2.4-1+
Calling this endpoint will synchronise the fonts installed in /storage/nuxeo/fonts/ to the nuxeo repository (/Workspaces/Patricia/Settings/EmailFonts/). In order to make the locally installed fonts available to the DMS email editor, these fonts must be in the nuxeo repository.
Note: This endpoint is automatically executed on startup of the DMS. Also note that this endpoint is meant to be an admin tool and cannot be used by "standard" users. You must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/run/setup-email-fonts
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/run/setup-email-fonts
17. Deleting duplicate files from the repository - available in DMS version 1.9.8.2.4-1+
This endpoint deletes duplicates of documents having an identical path. Such documents may have been created because of a WebDAV access to the repository (which is strongly recommended against) or manual manipulations. When invoked, the endpoint checks if any duplicates exist and removes such duplicates, leaving the original document untouched.
This endpoint is generally only necessary after importing an existing Patricia repository.
The endpoint has three input parameters:
- caseRef - case reference (e.g. P1046DEEP), or
- type - can have value "COMPLETE",
- dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
caseRef or type parameter must be provided to start.
Important: This endpoint consumes high CPU and, in particular, causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/operation/DeleteDuplicates
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/DeleteDuplicates?caseRef=P1000EP00
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/DeleteDuplicates?caseRef=P1000EP00&dryRun=false
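The /operation/ endpoints share the caseRef, type and dryRun parameters, so one helper can build their URLs. The following is a sketch only, with hypothetical names (build_operation_url and its arguments). Note that these endpoints spell the parameter dryRun, whereas the purge endpoints further above use dryrun.

```python
from urllib.parse import urlencode


def build_operation_url(server_host, operation, case_ref=None,
                        op_type=None, dry_run=True):
    """Build an /operation/ URL; dry run stays on unless disabled explicitly."""
    if case_ref is None and op_type is None:
        raise ValueError("either caseRef or type must be provided")
    if op_type not in (None, "COMPLETE"):
        raise ValueError("type can only have the value COMPLETE")
    params = {}
    if case_ref is not None:
        params["caseRef"] = case_ref
    if op_type is not None:
        params["type"] = op_type
    if not dry_run:
        # Only this branch produces a URL that alters data.
        params["dryRun"] = "false"
    base = f"http://{server_host}/casebrowser/admin/operation/{operation}"
    return f"{base}?{urlencode(params)}"
```

Keeping dry_run=True as the default mirrors the behaviour of the endpoint itself: a caller has to opt in to the destructive run.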
18. Normalizing filenames after import - available in DMS version 1.9.8.2.4-1+
This endpoint normalises the filenames of imported document blobs. Imported documents sometimes have multiple spaces in their filenames. Nuxeo automatically replaces multiple spaces with a single space in the nuxeo document name and path but leaves the document's blob name unchanged. This can cause issues when opening these documents with DocIntegrate.
This endpoint is generally only necessary after importing an existing Patricia repository.
The endpoint has three input parameters:
- caseRef - case reference (e.g. P1046DEEP), or
- type - can have value "COMPLETE",
- dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
caseRef or type parameter must be provided to start.
Important: This endpoint consumes high CPU and, in particular, causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/operation/NormalizeFilenamesAfterImport
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/NormalizeFilenamesAfterImport?caseRef=P1000EP00
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/NormalizeFilenamesAfterImport?caseRef=P1000EP00&dryRun=false
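The kind of normalisation described in this section, replacing runs of spaces with a single space, can be illustrated as follows. This is not the DMS implementation. The function name collapse_spaces is hypothetical, and the sketch only handles the space character, since the treatment of tabs or other whitespace is not described here.

```python
import re


def collapse_spaces(filename):
    """Replace every run of two or more spaces with a single space."""
    return re.sub(r" {2,}", " ", filename)


print(collapse_spaces("Office  Action   2016.pdf"))
```

A blob name that differs from the result of such a function is a candidate for the mismatch between document name and blob name that this endpoint resolves.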
19. Removing multiple spaces from PatDocLog - available in DMS version 1.9.8.2.4-1+
This endpoint removes multiple spaces from the filename and document name fields of Pat_Doc_Log entries in the Patricia database. Imported documents sometimes have multiple spaces in their filenames. Nuxeo automatically replaces multiple spaces with a single space in the nuxeo document name and path but does not update the filename and document name fields of the Pat_Doc_Log table. This makes it impossible to work with these documents in Patricia.
This endpoint is generally only necessary after importing an existing Patricia repository.
The endpoint has three input parameters:
- caseRef - case reference (e.g. P1046DEEP), or
- type - can have value "COMPLETE",
- dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
caseRef or type parameter must be provided to start.
Important: This endpoint consumes high CPU and, in particular, causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/operation/RemovePatDocLogMultipleSpaces
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/RemovePatDocLogMultipleSpaces?caseRef=P1000EP00
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/RemovePatDocLogMultipleSpaces?caseRef=P1000EP00&dryRun=false
20. Removing stored versions of documents - available in DMS version 1.9.8.2.4-1+
This endpoint removes the stored versions of existing documents in the DMS repository, leaving only the most recent version available. The DMS creates these versions on every modification of a document or when a new version of the document is uploaded in CaseBrowser.
This endpoint will purge the versions so that they can no longer be recovered. Please use this endpoint with extreme caution.
The endpoint has three input parameters:
- caseRef - case reference (e.g. P1046DEEP), or
- type - can have value "COMPLETE",
- dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
caseRef or type parameter must be provided to start.
Important: This endpoint consumes high CPU and, in particular, causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Further important note: To regain the disk space occupied by the removed versions, the binaries of these documents must be removed using the nuxeo web console after the document versions have been removed from the repository using the endpoint:
- Log into the Nuxeo web console as administrator;
- Go to "Admin" > "System Information" > "Repository binaries";
- Check the Delete orphaned binaries check box. If you just want to gather statistics about what is going to be removed from disk, do not check this box and go to the next step;
- Click on Mark orphaned binaries.
This will permanently remove all binaries that are no longer used by the Nuxeo DMS system and free up their disk space.
Endpoint: http://<server_host>/casebrowser/admin/operation/RemoveVersions
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/RemoveVersions?caseRef=P1000EP00
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/RemoveVersions?caseRef=P1000EP00&dryRun=false
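To reduce the risk of accidentally sending dryRun=false, the URL can be assembled by a small helper instead of being typed by hand. The following Python sketch only builds the URL string; it does not call the endpoint, since the call must still be made from a browser session logged in as "Administrator". The function name remove_versions_url and its arguments are illustrative and not part of the DMS.

```python
from urllib.parse import urlencode


def remove_versions_url(server_host, case_ref=None, type_=None, dry_run=True):
    """Build a RemoveVersions URL (hypothetical helper, not part of the DMS).

    dryRun is appended only when data should really be altered, so the
    endpoint's own default (dryRun=true) applies otherwise.
    """
    if not case_ref and not type_:
        raise ValueError("either caseRef or type must be provided")
    params = {}
    if case_ref:
        params["caseRef"] = case_ref
    if type_:
        params["type"] = type_
    if not dry_run:
        params["dryRun"] = "false"
    base = f"http://{server_host}/casebrowser/admin/operation/RemoveVersions"
    return f"{base}?{urlencode(params)}"


# Dry run for a single case (no data is altered).
print(remove_versions_url("nux7.priv.pace-ip.com:9080", case_ref="P1000EP00"))
```

Keeping dry_run=True as the default of the helper mirrors the default of the endpoint, so altering data always requires an explicit decision.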
21. Removing duplicates of emailAttachments folders - available in DMS version 1.9.8.2.4-1+
This endpoint removes duplicates of email attachment folders for the entire DMS repository. Duplicate folders are sometimes created during email attachment processing using multiple cores.
The endpoint has one input parameter:
dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
Important: This endpoint consumes high CPU and, in particular, also causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/operation/PurgeDuplicateAttachments
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/PurgeDuplicateAttachments
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/PurgeDuplicateAttachments?dryRun=false
22. Advanced document processing reset servlet - available in DMS version 1.9.8.2.4-1+
This endpoint allows resetting the piTextExtractionLogicVersion, mailAnysLogicVersApplied, previewLogicVersApplied, and metadataSyncLogicVersion fields for documents matching the provided NXQL query. See also item 2 (Reset servlet).
The endpoint has three input parameters:
- params - params to reset. This has to be a comma separated list in the format schema:propertyName. E.g.: "pifile:mailAnysLogicVersApplied,pifile:previewLogicVersApplied".
- nxql - NXQL query to get documents to modify
- dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
Both the params and the nxql parameters must be provided to start.
Important: This endpoint consumes high CPU and, in particular, also causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/operation/ResetDocumentsOperation
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/ResetDocumentsOperation?params=pifile:mailAnysLogicVersApplied&nxql=SELECT * FROM File
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/ResetDocumentsOperation?params=pifile:mailAnysLogicVersApplied&nxql=SELECT * FROM File&dryRun=false
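The NXQL query in the examples above contains spaces and a "*", which most browsers encode silently when the URL is pasted into the address bar. When the URL is generated by a script or passed on to a colleague, it is safer to encode the query explicitly. The following Python sketch shows one way to do this with the standard library; the helper name reset_documents_url is illustrative, and it is assumed that the endpoint decodes standard percent-encoding.

```python
from urllib.parse import quote, urlencode


def reset_documents_url(server_host, params, nxql, dry_run=True):
    """Build a ResetDocumentsOperation URL (hypothetical helper).

    params is a list of "schema:propertyName" strings; nxql is the raw
    NXQL query. The query is percent-encoded, while ":" and "," are kept
    readable because the params list uses them as separators.
    """
    if not params or not nxql:
        raise ValueError("both params and nxql must be provided")
    query = {"params": ",".join(params), "nxql": nxql}
    if not dry_run:
        query["dryRun"] = "false"
    base = f"http://{server_host}/casebrowser/admin/operation/ResetDocumentsOperation"
    return f"{base}?{urlencode(query, quote_via=quote, safe=':,')}"


# Dry run: reset the mail analysis flag for all File documents.
print(reset_documents_url(
    "nux7.priv.pace-ip.com:9080",
    ["pifile:mailAnysLogicVersApplied"],
    "SELECT * FROM File",
))
```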
23. Document export servlet - available in DMS version 1.9.8.2.4-1+
This endpoint allows exporting documents from the DMS to the /storage/nuxeo/data/export/ directory.
The directory needs to have full read/write access for the nuxeo process. Please execute:
chown 1000:1000 /storage/nuxeo/data/export/ -R
If not present, the directory will be created upon startup of the export command. Please redeploy your DMS before running the export command to make sure the latest version is installed.
The endpoint has the following input parameters:
- caseRefs - a comma separated list of case references (e.g. "P1000DE00,P1000AUPC,D1038ES00"), or
- type - can have value "COMPLETE", "ACTORS", or "READFROMFILE",
- includedCategories - a comma separated list of Patricia category IDs
Either the caseRefs or the type parameter must be provided to start; if the type parameter has a valid value, the caseRefs parameter is ignored. In the comma separated list of case references, reserved characters (e.g. "&", "@", "?", "$") must be replaced by their corresponding percent-encoded form (for example, "&" becomes "%26").
The includedCategories parameter limits the export to documents in the listed categories. The parameter is optional; if unspecified, documents of all categories are exported. Documents with "No Category" can be specified by the id "0". For example, "includedCategories=0,20011" exports documents in category 20011 as well as documents without a category.
For a READFROMFILE-type export, the endpoint expects a text document "casesToExport.txt" in /storage/nuxeo/data/export/ which provides the list of cases to export. The file must specify one caseRef per line (not a comma separated list).
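A case list that already exists as a comma separated string can be converted into the one-per-line layout that READFROMFILE expects. The following Python sketch illustrates this; the function names are illustrative, and whether the endpoint tolerates blank lines or a trailing newline is not documented here, so the sketch simply avoids blank lines and writes a single trailing newline as an assumption.

```python
from pathlib import Path


def cases_file_content(case_refs):
    """Return file content with one caseRef per line (hypothetical helper)."""
    cleaned = [ref.strip() for ref in case_refs if ref.strip()]
    for ref in cleaned:
        # A comma or whitespace inside an entry suggests that a comma
        # separated list was passed in without being split first.
        if "," in ref or any(ch.isspace() for ch in ref):
            raise ValueError(f"not a single case reference: {ref!r}")
    return "\n".join(cleaned) + "\n"


def write_cases_file(case_refs, export_dir="/storage/nuxeo/data/export"):
    """Write casesToExport.txt into the export directory and return its path."""
    target = Path(export_dir) / "casesToExport.txt"
    target.write_text(cases_file_content(case_refs), encoding="utf-8")
    return target


# Convert a comma separated list into the per-line layout.
print(cases_file_content("P1000DE00,P1000AUPC,D1038ES00".split(",")), end="")
```

After writing the file, remember that it must be readable by the nuxeo process, as described for the export directory above.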
Note: The COMPLETE-type exports documents for cases only. The ACTORS-type exports only documents for actors.
Important: This endpoint may cause high storage I/O. Its use should be avoided during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Endpoint: http://<server_host>/casebrowser/admin/operation/ExportDocuments
Example: http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/ExportDocuments?type=COMPLETE
When using the document export servlet for the entire data, please consider the following:
- The system's disk size should be set so that occupancy is at most 40% (40% current data, 40% space for exported data, 20% general spare for functionality)
- If possible, double the heap space
- Manually redeploy the system prior to running the endpoint
- The estimated time for an average system to process 10k cases is approximately 45 minutes.
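The figures in the list above can be turned into a quick pre-flight check. The following Python sketch estimates the export duration and checks disk occupancy against the 40% guideline. It assumes that the duration scales linearly with the number of cases, which is only a rough approximation; the function names are illustrative.

```python
import shutil


def estimated_export_minutes(case_count, minutes_per_10k=45):
    """Rough duration estimate, assuming linear scaling with the case count."""
    return case_count / 10_000 * minutes_per_10k


def occupancy_ok(used_bytes, total_bytes, max_ratio=0.40):
    """True if current disk occupancy is within the 40% guideline."""
    return used_bytes / total_bytes <= max_ratio


def check_disk(path="/storage"):
    """Check the occupancy of the file system holding path (path is an assumption)."""
    usage = shutil.disk_usage(path)
    return occupancy_ok(usage.used, usage.total)


# 40,000 cases at roughly 45 minutes per 10k cases.
print(f"{estimated_export_minutes(40_000):.0f} minutes")
```

On the appliance, check_disk() would be called with the mount point that holds the repository and the export directory.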
24. Update index structure to support case-insensitive search - available in DMS version 1.9.8.2.4-2 - 1.9.8.2.9-2
This endpoint will remove and trigger recreation of the nuxeo-internal index so as to allow case insensitive text searches against your repository. The index content will be saved and restored after the index structure is recreated. The operation will to be triggered if piFullTextTransferLogicVersion shows unprocessed documents.
The endpoint has one input parameter:
dryRun - by default, the endpoint will run with dryRun=true (no data will be altered). To alter data, set dryRun=false.
Important: This endpoint consumes high CPU during index recreation and, in particular, also causes very high storage I/O. It should not be used during production hours. Also, this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Note: Index restore may take several hours depending on your repository size during which your DMS system cannot be used in production. Please time your index recreation accordingly.
Endpoint: http://<server_host>/casebrowser/admin/operation/EnableCaseInsensitiveSearch
Example 1 (will not alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/EnableCaseInsensitiveSearch
Example 2 (will alter data): http://nux7.priv.pace-ip.com:9080/casebrowser/admin/operation/EnableCaseInsensitiveSearch?dryRun=false
This endpoint was removed in version 1.9.8.2.10.
25. Global Compare - available in DMS version 1.9.8.2.5-1+
This endpoint lets you run a comparison of the documents in the DMS Documents tree (e.g. /Workspaces/Patricia/Documents/) with the corresponding entries in the PAT_DOC_LOG (PDL) table of Patricia. The output is split into two tables depending on the result of the comparison: documents that are in the DMS but not in PDL are listed in one table, and documents that are in PDL but not in the DMS are listed in another.
Progress information will be written to the nuxeo container log.
Starting with version 1.9.8.2.8, the number of log entries depends on the batch size for the comparison, which is set in the PAT_DEMS_SETTING "pdlcompare.batch.size". The default batch size is 100k documents, which delivers reasonably detailed progress information to the log and is of adequate size to run well on systems with average JVM heap space availability for the nuxeo and cb containers. Smaller batch sizes have a negative impact on the speed of the operation; larger batch sizes increase overall speed but require more heap space (only suggested for large environments).
Both tables are in the Patricia SQL DB:
- docInDMS_notPDL
- docInPDL_notDMS
They fetch their data from the temporary tables:
- checkSync_nuxeo_data
- checkSync_pdl_data
Endpoint: http://<server_host>/casebrowser/admin/operation/PdlCompare
Example: http://172.20.30.50:9080/casebrowser/admin/operation/PdlCompare
Important: this endpoint is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
26. Global Sync mode of the PdlCheckSync endpoint (item 6) - available in DMS version 1.9.8.2.8+
The PdlCheckSync endpoint has a specific mode that extends the endpoint described in item 6.
This mode relies on the availability of the tables generated by the Global Compare endpoint (item no. 25) so it is always a second step after having run the Global Compare endpoint.
Progress information will be written to the casebrowser container log.
The PAT_DEMS_SETTING "pdlcompare.batch.size" also applies here and defines the batch size for this operation. The default batch size is 100k documents, which delivers relatively scarce progress information to the log but is of adequate size to run well on systems with average JVM heap space availability for the nuxeo and cb containers. Smaller batch sizes reduce the speed of the operation but lead to more frequent log updates; larger batch sizes increase overall speed but require more heap space and reduce log feedback (only suggested for large environments).
Endpoint: http://<server_host>/casebrowser/admin/PdlCheckSync?usePdlCompare=true
Example: http://172.20.30.50:9080/casebrowser/admin/PdlCheckSync?usePdlCompare=true
Important: this is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
27. Move Case endpoint - available in DMS version 1.9.8.2.9-2+
To provide input to the MoveCase endpoint, a cases_to_move.txt file must be created in the nuxeo data folder (by default /storage/nuxeo/data).
Each line in this file defines a single task to move a case, in the following format:
${sourceCasePath} ${targetCasePath}
For example, a file that looks like this:
2/1000/DE/00 2/1000/DE/01
2/1001/DE/EP 2/1234/DE/EP
will move case P1000DE00 to P1000DE01 and case P1001DEEP to P1234DEEP.
When the operation is finished, the result is shown in the browser and written to the cases_to_move.result.txt file in the same folder.
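Because every line of cases_to_move.txt triggers a move in the repository, it can help to generate the file from a reviewed list of path pairs rather than editing it by hand. The following Python sketch formats such pairs into the "${sourceCasePath} ${targetCasePath}" layout described above and rejects entries that would break it. The function name is illustrative, and the sketch does not derive case paths from case references; the paths must be supplied as they appear in the repository.

```python
def move_tasks_content(moves):
    """Format (source_path, target_path) pairs, one task per line (hypothetical helper)."""
    lines = []
    for source, target in moves:
        for path in (source, target):
            # A blank or whitespace-containing path would be read as
            # more (or fewer) than two fields on the line.
            if not path or any(ch.isspace() for ch in path):
                raise ValueError(f"invalid case path: {path!r}")
        if source == target:
            raise ValueError(f"source and target are identical: {source!r}")
        lines.append(f"{source} {target}")
    return "\n".join(lines) + "\n"


# Reproduces the example file shown above.
print(move_tasks_content([
    ("2/1000/DE/00", "2/1000/DE/01"),
    ("2/1001/DE/EP", "2/1234/DE/EP"),
]), end="")
```

The resulting text would then be saved as cases_to_move.txt in the nuxeo data folder before the endpoint is called.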
Endpoint: http://<server_host>/casebrowser/admin/operation/MoveCase
Example: http://172.20.30.50:9080/casebrowser/admin/operation/MoveCase
Important: this is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Important: this operation takes a long time. You therefore need to stop the service container to avoid the daily restart, and raise the default DMS timeout by adding the following to dms.conf:
CUSTOM_PARAMETERS="nuxeo.db.transactiontimeout=172800"
28. Delete Case endpoint - available in DMS version 1.9.8.2.9-2+
To provide input to the DeleteCase endpoint, a cases_to_delete.txt file must be created in the nuxeo data folder (by default /storage/nuxeo/data).
Each line in this file defines the path of a case to be deleted.
For example, a file that looks like this:
2/1000/DE/00
2/1001/DE/EP
will delete cases P1000DE00 and P1001DEEP.
When the operation is finished, the result is shown in the browser and written to the cases_to_delete.result.txt file in the same folder.
All deleted case folders are moved to the Nuxeo trash and can be restored from there. To free the disk space, the user must empty the trash folder and delete orphaned binaries after invoking the Delete Case endpoint.
Endpoint: http://<server_host>/casebrowser/admin/operation/DeleteCase
Example: http://172.20.30.50:9080/casebrowser/admin/operation/DeleteCase
Important: this is meant to be an admin tool and should not be used by "standard" users. To enforce this, you must log into Casebrowser using the "Administrator" account before calling this endpoint from the same browser.
Important: this operation takes a long time. You therefore need to stop the service container to avoid the daily restart, and raise the default DMS timeout by adding the following to dms.conf:
CUSTOM_PARAMETERS="nuxeo.db.transactiontimeout=172800"