This blog post is a short cheat sheet that shows how to list the projects and collections of a Watson Discovery instance and run a query against them, using cURL and the IBM Cloud Watson Discovery API V2. You can find more details in the IBM Cloud Watson Discovery API documentation.
1. Log on to IBM Cloud
ibmcloud login   # add --sso if your account uses single sign-on
REGION=us-south
GROUP=default
ibmcloud target -r $REGION -g $GROUP
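The commands in this cheat sheet assume that the ibmcloud CLI, curl, awk, and jq are installed. A minimal sketch of a prerequisite check could look like the following; the helper name require_cmd is made up for this example and is not part of any IBM tooling.

```shell
# Check that every command passed as an argument is available on the PATH.
# Returns 1 and prints a message for the first missing command.
require_cmd() {
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "Missing required command: $cmd" >&2
      return 1
    fi
  done
}

# Usage:
#   require_cmd ibmcloud curl awk jq || exit 1
```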
2. Get your instance ID
DISCOVERY_SERVICE_NAME="YOUR_SERVICE"
WATSON_DISCOVERY_INSTANCE_ID=$(ibmcloud resource service-instance "$DISCOVERY_SERVICE_NAME" | grep "GUID" | awk '{print $2;}')
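If the service name is misspelled, the pipeline above silently leaves the variable empty and every later request fails with a confusing URL. The following is a small sketch of a stricter variant. It assumes that the CLI prints the GUID on a line of the form "GUID:   <value>"; the helper names extract_guid and require_value are hypothetical.

```shell
# Read the output of "ibmcloud resource service-instance NAME" on stdin
# and print the value of the first "GUID:" line.
# Assumption: the CLI prints the GUID as "GUID:   <value>" on its own line.
extract_guid() {
  awk '$1 == "GUID:" { print $2; exit }'
}

# Fail early with a message when a required value is empty.
# Arguments: $1 = variable name (used in the message), $2 = its value.
require_value() {
  if [ -z "$2" ]; then
    echo "Error: $1 is empty" >&2
    return 1
  fi
}

# Usage (needs a logged-in ibmcloud CLI):
#   WATSON_DISCOVERY_INSTANCE_ID=$(ibmcloud resource service-instance "$DISCOVERY_SERVICE_NAME" | extract_guid)
#   require_value WATSON_DISCOVERY_INSTANCE_ID "$WATSON_DISCOVERY_INSTANCE_ID" || exit 1
```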
3. List the projects
The requests below authenticate with an IAM bearer token. See the Authentication section of the API documentation for details.
OAUTHTOKEN=$(ibmcloud iam oauth-tokens | awk '{print $4;}')
DISCOVERY_URL="https://api.$REGION.discovery.watson.cloud.ibm.com/instances/$WATSON_DISCOVERY_INSTANCE_ID"
VERSION="2023-03-31"
curl -X GET -H "Authorization: Bearer $OAUTHTOKEN" "$DISCOVERY_URL/v2/projects?version=$VERSION"
Example output:
{
"projects" : [ {
"project_id" : "YYYYY",
"type" : "document_retrieval",
"name" : "Sample Project 1",
"collection_count" : 1
}, {
"project_id" : "ZZZZZ",
"type" : "document_retrieval",
"name" : "Sample Project 2",
"collection_count" : 1
} ]
}
Alternatively, authenticate with your API key instead of the bearer token:
APIKEY=YOUR_APIKEY
curl -X GET -u "apikey:$APIKEY" "$DISCOVERY_URL/v2/projects?version=$VERSION"
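Instead of copying the project ID from the output by hand, you can look it up by project name. This is a sketch only: it relies on jq and on the response shape shown in the example output above, and the function name project_id_by_name is hypothetical.

```shell
# Read a "list projects" response on stdin and print the project_id of the
# project whose name equals $1. Prints nothing if no project matches.
project_id_by_name() {
  jq -r --arg name "$1" '.projects[] | select(.name == $name) | .project_id'
}

# Usage:
#   PROJECT_ID=$(curl -s -H "Authorization: Bearer $OAUTHTOKEN" \
#     "$DISCOVERY_URL/v2/projects?version=$VERSION" | project_id_by_name "Sample Project 2")
```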
4. List the collections
OAUTHTOKEN=$(ibmcloud iam oauth-tokens | awk '{print $4;}')
PROJECT_ID=ZZZZZZ
curl -X GET -H "Authorization: Bearer $OAUTHTOKEN" "$DISCOVERY_URL/v2/projects/$PROJECT_ID/collections?version=$VERSION"
Example output:
{
"collections" : [ {
"name" : "Collection1",
"collection_id" : "ZZZZZ_ZZZZZZ"
} ]
}
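The query in the next step needs the collection IDs as a JSON array. As a sketch, and assuming the response shape from the example output above, jq can turn the collections response directly into that array. The function name collection_ids_json is hypothetical.

```shell
# Read a "list collections" response on stdin and print all collection IDs
# as a compact JSON array, for example ["ZZZZZ_ZZZZZZ"].
collection_ids_json() {
  jq -c '[.collections[].collection_id]'
}

# Usage:
#   COLLECTION_IDS=$(curl -s -H "Authorization: Bearer $OAUTHTOKEN" \
#     "$DISCOVERY_URL/v2/projects/$PROJECT_ID/collections?version=$VERSION" | collection_ids_json)
```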
5. Run a query
COLLECTION_ID_1=ZZZZZ_ZZZZZZ
COLLECTION_ID_2=YYYYY_YYYYYY
QUERY_TEXT="Credit card usage in Europe"
curl -X POST \
  -H "Authorization: Bearer $OAUTHTOKEN" \
  -H "Content-Type: application/json" \
  --data "{ \"collection_ids\": [ \"${COLLECTION_ID_1}\", \"${COLLECTION_ID_2}\" ], \"query\": \"text:${QUERY_TEXT}\" }" \
  "$DISCOVERY_URL/v2/projects/${PROJECT_ID}/query?version=$VERSION" | jq '.'
Example output:
{
"matching_results": 7,
"retrieval_details": {
"document_retrieval_strategy": "untrained"
},
"results": [
{
"document_id": "XXXX",
"result_metadata": {
"collection_id": "ZZZZZ_ZZZZZZ"
},
"metadata": {
"parent_document_id": "YXY",
"customer_id": "YXY"
},
"extracted_metadata": {
"sha1": "XXXX",
"numPages": "52",
"filename": "XXX.docx",
"author": [
"Thomas Suedbroecker"
],
"file_type": "word",
"text_mappings": "{\"text_mappings\":[{\"page\":{\"page_number\":1,\"bbox\":[111,111,111,111]},\"field\": ... }",
"title": "",
"publicationdate": "20XX-XX-XX"
},
"html": [ ... ],
"text": [ ... ],
"table_results_references": []
}
],
...
}
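The request body in step 5 is assembled by string interpolation, which produces invalid JSON as soon as the query text contains a double quote. The following sketch lets jq do the escaping and also prints a compact summary of the results. It assumes the response fields shown in the example output above; the function names build_query_body and summarize_results are hypothetical.

```shell
# Build the JSON body for the query request.
# Arguments: $1 = query text, $2 = JSON array of collection IDs.
# jq escapes quotes and other special characters in the query text.
build_query_body() {
  jq -cn --arg query "text:$1" --argjson ids "$2" \
    '{collection_ids: $ids, query: $query}'
}

# Read a query response on stdin and print one tab-separated line per result:
# document_id, collection_id, filename (empty if the filename is missing).
summarize_results() {
  jq -r '.results[]
         | [.document_id, .result_metadata.collection_id, (.extracted_metadata.filename // "")]
         | @tsv'
}

# Usage:
#   BODY=$(build_query_body "$QUERY_TEXT" "[\"$COLLECTION_ID_1\", \"$COLLECTION_ID_2\"]")
#   curl -s -X POST -H "Authorization: Bearer $OAUTHTOKEN" \
#     -H "Content-Type: application/json" --data "$BODY" \
#     "$DISCOVERY_URL/v2/projects/$PROJECT_ID/query?version=$VERSION" | summarize_results
```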
I hope this was useful to you. Let's see what's next!
Greetings,
Thomas
#curl, #watsondiscovery, #api, #watson, #ibmcloud, #ai
