Azure Search Autoindexing DocumentDB Data

Hello.

Wouldn’t it be nice if you could sync (index) your DocumentDB data to Azure Search on an automatic schedule? Yep, that’s what we will do today. At the end, your DocumentDB data will be indexed to Azure Search automatically, and you will be able to search it from Azure Search as if you were searching the DocumentDB data directly.

Azure Search Index DocumentDB Data on Schedule

The original (and great) walkthrough tutorial is at https://azure.microsoft.com/en-us/documentation/articles/documentdb-search-indexer/. This is my version of walking through that tutorial.

I have previously covered a bit about Azure Search and DocumentDB, so if you are not too familiar with what they are and what they are capable of, please refer to Mastering Azure DocumentDB Part 1 and Mastering Azure Search.

To auto-index your DocumentDB data into Azure Search, you need two things: first a data source, and second an indexer.
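To make those two pieces a bit more concrete, here is a minimal Python sketch of the JSON bodies you would send to the Azure Search REST API to create them. The field names follow the tutorial linked above as I understand it, so please double check them there. The helper names (`build_data_source`, `build_indexer`), the sample values, and the one-hour interval are just my own placeholders, and nothing here actually calls the service.

```python
import json


def build_data_source(name, account_endpoint, account_key, database, collection):
    """Build the request body for a data source that points at a DocumentDB collection.

    Assumption: the field names below match the Azure Search indexer tutorial.
    """
    connection_string = (
        f"AccountEndpoint={account_endpoint};"
        f"AccountKey={account_key};"
        f"Database={database}"
    )
    return {
        "name": name,
        "type": "documentdb",
        "credentials": {"connectionString": connection_string},
        "container": {"name": collection},
    }


def build_indexer(name, data_source_name, target_index_name, interval="PT1H"):
    """Build the request body for an indexer that runs on a schedule.

    The interval is an ISO 8601 duration; "PT1H" (every hour) is only an example.
    """
    return {
        "name": name,
        "dataSourceName": data_source_name,
        "targetIndexName": target_index_name,
        "schedule": {"interval": interval},
    }


if __name__ == "__main__":
    # All values here are hypothetical placeholders.
    data_source = build_data_source(
        "my-docdb-datasource",
        "https://myaccount.documents.azure.com:443/",
        "<account key>",
        "mydatabase",
        "mycollection",
    )
    indexer = build_indexer("my-docdb-indexer", data_source["name"], "my-search-index")
    print(json.dumps(data_source, indent=2))
    print(json.dumps(indexer, indent=2))
```

The data source tells Azure Search where the data lives, and the indexer ties that data source to a target index and says how often to run.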


Azure DocumentDB Limitation

The original document is at https://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/, but I am making a note for myself here too. These limits are worth knowing when you are working with DocumentDB and designing your service structure.

  • You can only have up to 5 database accounts by default.
  • You can only have up to 100 databases per account.
  • Number of users per account is 500,000.
  • Number of permissions per database account is 2,000,000.
  • Attachment storage per database account is 2GB. (I’ll probably use blob storage and just keep a URI reference to the location, so I’m not too worried about this limitation for now.)
  • Max request units per second per collection is 2500.
  • Number of stored procedures, triggers and UDFs per collection is 25 each.
  • Max execution time for stored procedure and trigger is 5 seconds.
  • Provisioned document storage per collection is 10GB.
  • Max collections per database account is 100. (So with the default 5 accounts, that’s 500 max in total.)
  • Max document storage per database is 1TB. That is, you can have up to 100 collections, and together they have to stay under 1TB.
  • Max length of the ID property is 255 characters.
  • Max items per page is unlimited. There is no limit here, but if I had to name one, it would be 1TB worth of documents to page through.
  • Max request size of a document or attachment is 512KB.
  • Max response size is 1MB. (How big or small is 1MB of JSON?)
  • All requests to DocumentDB must be UTF-8 encoded.
  • Max number of UDFs per query is 2.
  • Max number of JOINs per query is 5.
  • Max number of AND clauses per query is 20.
  • Max number of values per IN expression is 200.
  • Max number of points in a polygon argument in an ST_WITHIN query is 16. (I don’t know what this is yet; I will look into it later.)
  • Max number of collection creates per minute is 5.
  • Max number of scale operations per minute is 5.