Tuesday, February 20, 2007

Manage Large SharePoint Lists for Better Performance

Large lists have become fairly common and can cause the server and users considerable pain if you’re not careful.  When I refer to a list in SharePoint, that includes document libraries, calendars, contacts, tasks, etc.

A SharePoint list containing thousands or even millions of items is not necessarily problematic if well managed. As a general guideline, any list should not have more than 2,000 items per list container. By container, I mean the root of the list, or any folder inside it. So for example, if you have a list with 1,900 list items in the list root, plus 100 folders each containing 2,000 items, then you are fine. Just make sure that your views do not attempt to show or edit more than 2,000 items at a time.  In total, counting all items recursively in folders, SharePoint supports up to 5 million items in each list.

If you try to select approximately 5,000 items or more simultaneously for reading or update, the database (MS SQL Server) will typically lock the entire table for the duration of that change. This means that all other operations (read/write) performed on any other list in any site collection in the same database are queued until this large transaction is complete and the lock is released. This is also the case if the query or view you are using retrieves data across multiple folders within the list or across different lists, regardless of recursive nesting of folders as per the guidelines above. To avoid running into this locking behavior, make sure the number of items you retrieve in a single request is well below this 2,000-item threshold. For example, you can control the number of records returned by setting the item limit when creating or modifying your view. 











While this doesn’t guarantee performance gain, if coincidentally the 100 items you are requesting are scattered across the large list, this is typically a useful way to manage and improve the performance of your large list.

Another way to limit the number of items in a container is to divide items into folders. The next figure shows the relative performance between folder views, when folders are used to store and organize documents, and an indexed view of a flat library structure. Each folder contains 500 documents created by different users. In this scenario, there is no significant throughput degradation up to 1 million documents for either scenario, provided that the number of items in the view does not exceed the performance threshold for your system. However, performance is better when folders are used. So if we compare retrieving 2,000 items from a large flat indexed list with 'view by folder', we find that the latter provides better performance. 


As the number of items in a folder increases, folder view performance will gradually degrade. Note that the above results are estimates based on our testing, and results may vary in your environment.

Indexing your lists properly plays an important role as well in the performance of your views, when used in coordination with applying filters on those indexed columns. For example, if you have a column named "Color" and you index it, then retrieving all the "Red" items will in most cases perform better than if you had not indexed that field, because your view will not have to scan all your items to find the red ones. So if you happen to have 2,000 items in your list, but only 300 red ones, then SharePoint will not have to scan the entire list to retrieve the 300 requested by  your view, and that makes for better performance than if you were to perform the same query without that index.

Here's a screenshot of the page from which you edit your indexed columns: 



















For step by step instructions on how to create an index, see Manage lists and libraries with many items.

As you create views, it's important to remember that SharePoint will only use the first index you specify in your filter on the Edit View or Create View page. For example, let's say you have a list with some columns, two of which are Color and Shape, and both of these columns are indexed. When you create a new view, you can filter on Color (Indexed) = Red and Shape (Indexed) = Square (in that order). 


















 However, your view won’t benefit from both of the indexes. That’s  because SharePoint will get all the Red items (using the index for the Color column), and then scan those items for Square shaped ones, without using the index for the Square column. So in that case if you have 1000 red items, and only 5 square ones total in your list, then it's best to make sure that your filter is on Shape=Square and then Color=Red instead of vice versa. That way, SharePoint will get the Square items (just 5) and scan them for Red ones, which will perform a lot better than scanning all 1000. 

















For step by step instructions on how to create a filtered view using an indexed field, see Manage lists and libraries with many items.