Friday 28 September 2012

Data Querying Options (SharePoint 2010) – Server OM vs Client OM vs REST vs LINQ vs Search API


SharePoint 2010 allows various options to query SharePoint data within SharePoint and from outside of SharePoint.It allows both server side and client side data access technologies. This article will provide high level features, pros, and cons of each server side and client side data access technologies. Additionally, SharePoint allows developers to write custom WCF Services and write custom queries based on Enterprise Search API. Hopefully at the end of this article, it would allow you to make decision which option would work best for appropriate use cases.

Server Side Object Model -  SPSite, SPWeb, SPList, SPDocumentLibrary, SPListItem, SPView, SPField, SPContentType etc.
  • Major features
    • Implemented as Microsoft.SharePoint Namespace
    • Code must run on the SharePoint Servers
    • Programmatic approach to interact with SharePoint objects
    • APIs to provision and maintain building blocks like lists, site columns, content types, or pages
    • APIs to iterate through farm, web application, site, web, list, list item, fields, content types collections
  • Pros
    • Virtually any APIs or objects available to access SharePoint data
    • Easy to learn API, familiar concepts such as foreach() to loop through collections
    • Supported in Sandbox solutions. Site collection level APIs works great in Sandbox Solutions
  • Cons
    • Not available on the remote, Non-SharePoint servers
    • foreach(), GetItemById, GetListItems increase resource utilization and may affect performance
    • Requires IISREST while deploying farm level code on the server, Sandbox solutions doesn’t require IISReset
Server Side Object Model – SPQuery and SPSiteDataQuery
  • Major features
    • SPQuery available since SharePoint 2003 days and SPSiteDataQuery available since MOSS 2007 days
    • Implemented as Microsoft.SharePoint Namespace
    • Code must run on the SharePoint Servers
    • Programmatic approach to interact with SharePoint objects
    • SPQuery executes query against single list or performs join with CAML Predicate on multiple lists on lookup columns with in site collection
    • SPDataQuery executes query  against multiple lists to  aggregate results, scoped at site collection level.
    • SPDataQuery returns data as ADO.NET data table. It has potential performance issues for rollup views because it first loops through all the lists and then process the query – multiple round trips
    • Fastest query option but good only for small scopes
    • Developer must parse CAML fragments and access to columns is not strongly typed
    • SPQuery can query external list data within a site
  • Pros
    • CAML is very fast – compared to foreach, fastest way to data access
    • Supports complex queries, such as sorting, filtering, and joining on multiple sources, much better options compared to SPList.GetListItems
    • Supported in Sandbox solutions. Site collection level APIs works great in Sandbox Solutions
    • Query Throttling override is supported
      • You must use this option to execute queries that return large sets, by default more than 5000 records are blocked, these settings are web application level, defined in central admin, must adhere in Office 365
      • QueryThrottleMode = SPQueryThrottleOption.Override, Object Model Override = Yes, and Query Runs under Super Users (User with Full Read or Full Control Permission)
      • Use List Level Indexing in conjunction with SPQuery to avoid List Throttling limit
  • Cons
    • Not available on the remote, Non-SharePoint servers
    • Small scope, e.g. targeted to a particular list or library
    • Affected by List Throttling Limit (default List View Threshold is >5000)
    • CAML means low developer productivity – Hard coded column names, not strongly typed, quirky syntax, lack of tooling support, and difficult debugging
    • Requires IISREST while deploying farm level code on the server, Sandbox solutions doesn’t require IISReset
Server Side Object Model – LINQ to SharePoint
  • Major features
    • New in SharePoint 2010
    • Implemented as Microsoft.SharePoint.Linq Namespace
    • Executes query against single list or performs join on multiple lists on lookup columns with in site collection, You can’t aggregate multiple lists data without lookup column
    • Microsoft supplies spmetal.exe to generate LINQ to SharePoint entities
    • Strongly typed entities to perform CRUD operations against SharePoint data
    • Translates into CAML query during runtime
  • Pros
    • Not fastest as CAML but still very fast compare to foreach, Approx. 10% slower than SPQuery
    • Performs compile time errors against strongly typed objects from spmetal.exe
    • High productivity, IntelliSense, and less error-prone programming
    • Supported in Sandbox solutions.
  • Cons
    • Needs to regenerate the entities if List Schema changes
    • Query Throttling override is not supported
    • Small scope, e.g. targeted to a particular list or library
    • SPMetal can’t generate entity classes for more than one web
    • Although its minor, additional overhead occurs while converting query to CAML during runtime
    • Can’t query external list data within a site
    • Requires IISREST while deploying farm level code on the server, Sandbox solutions doesn’t require IISReset
Client Side Object Model
  • Major features
    • New in SharePoint 2010,  Subset of server OM functionality but similar APIs
    • Implemented as WCF Service – Client.svc, Adding Client.svc reference directly in Visual Studio is prohibited
    • Three different Models - .NET Managed, Silverlight, ECMAScript/JavaScript
      • For Javascript CSOM, reference SP.js is located under : %ProgramFiles%/Common Files/Microsoft Shared/Web Server Extensions/14/TEMPLATE/LAYOUTS
      • For Silverlight CSOM, Add the reference to Microsoft.SharePoint.Client.Silverlight.dll  and Microsoft.SharePoint.Client.Silverlight.Runtime.dll located under %ProgramFiles%/Common Files/Microsoft Shared/Web Server Extensions/14/TEMPLATE/LAYOUTS/ClientBin
      • For .NET Manage Client CSOM, Add reference to Microsoft.SharePoint.Client.dll and        Microsoft.SharePoint.Client.Runtime.dll located under %ProgramFiles%/Common Files/Microsoft Shared/web server extensions/14/ISAPI
    • CSOM executes query against single list or performs join with CAML Predicate on multiple lists on lookup columns with in site collection
    • Typically .NET Managed CSOM code will run outside of SharePoint in remote .NET or CRM applications, Silverlight CSOM is used only in Silverlight applications, and JavaScript CSOM  is must be used only in SharePoint Site, JCSOM can’t be used in a classic ASP.NET Site due to cross-scripting issue.
  • Pros
    • Ideal for accessing cross farm data when Search API is not an option or real-time remote data access is required
    • Allows submitting CAML queries directly from client side logic, making it fastest client side data access option
    • Supports Async processing, Request Throttling, and Batch object handling
    • CSOM can query external list data within a site
    • Supports Site Collection level SharePoint Object Model. It doesn’t provide administrative objects or objects that are scoped higher than site collection.
    • No IISREST is required while deploying a Silverlight or JavaScript CSOM based implementation. e.g. JavaScript CSOM can be implemented with just a Content Editor Web part.
  • Cons
    • Can’t access data outside of current site collection
    • Query Throttling override is not supported
    • Since these are web services interface, can’t accessed from Sandbox Solutions
    • Requires learning of CAML to query small result set – Hard coded, Un complied, Not strongly typed CAML query
    • Can’t elevate the user privilege using RunWithElevatedPrivilege or Impersonate using CSOM. It means, results retrieved using CSOM are always security trimmed by default.
    • Requires IISREST while deploying farm level, .NET managed CSOM code on the server
REST Based API – WCF Data Services
  • Major Features
    • New in SharePoint 2010
    • Implemented as REST based WCF Data Services – ListData.svc, REST is HTTP-based XML data transfer design pattern based on OData protocol – Stateless, cacheable, uniform, Able to maintain & browse lists using HTTP verbs – GET, POST, PUT, DELETE, and MERGE
    • Returns results in ATOM Feed or JSON format (For JavaScript clients)
    • Available for Lists (ListData.svc) and Excel Services REST API (ExcelRest.aspx)
    • Add Service Reference to ListData.svc, Proxy class provides LINQ-based strongly-typed access
  • Pros
    • Confronts Open Data (OData) protocol to support batching, concurrency, and partial updates
    • Client-side technique with strongly typed access, Intelligence Support, LINQ style productivity, and compile time errors checking against from proxy class
    • Can be used in Silverlight as a replacement for Client OM for strongly typed access
    • Can be query from any kind of client : Silverlight, Javascript, and even from PHP and JAVA
  • Cons
    • Supported only for Lists and Excel Services
    • Needs to refresh the proxy class if List Schema changes
    • Since these are web services interface, can’t accessed from Sandbox Solutions
    • Overhead of three Query conversations from URL based REST Queries to LINQ Queries to CAML Queries
    • Can be used against only SPList, can’t used against SPWeb, SPSite, or SPUser, doesn’t support SharePoint Object Model
    • Can’t query external list data within a site
    • Can’t query more than 1000 items at a time
SOAP Based API – ASMX Web Services
  • Major Features
    • Uses Legacy ASMX  ASP.NET Web Services technology
    • Deprecated, other options should be preferred over this
    • Scoped at the List or Library level data access, requires custom web services for multiple list joins
    • Small subset of APIs compared to SSOM
    • Provides legacy approach for remote data access
    • Developer must parse CAML fragments and access to columns is not strongly typed
    • Used in conjunction with SPServices for jQuery clients, SPServices supports Anonymous access and cross- site collection and cross-domain access
  • Pros
    • Ideal for accessing cross farm data when Enterprise Search API is not an option but CSOM should given preference here for remote cross farm data access in farm level solutions
    • Ideal for Office 365 Cross Site Collection data access in Sandbox Solutions in conjunction with SPServices, none of the other options like CSOM or REST APIs would work in sandbox
    • Supports advanced SharePoint Server capabilities like search, social, or user profile services, which are unavailable through the CSOM or REST APIs
    • Supports access from JavaScript and non-windows clients
  • Cons
    • Requires learning of CAML to query small result set – Hard coded, not strongly typed CAML query, no intelisense
    • Difficult to use in current format, often custom wrapper is required
    • Can be used against only SPList, Doesn’t support SharePoint Foundation Object Model
Custom WCF Services
  • Major features
    • Create your own web service interface for SharePoint
    • Extend out of box SharePoint functionality, requires to expose SharePoint functionality that is unavailable through the existing APIs
    • Can be hosted in SharePoint, IIS, or any other .NET client
  • Pros
    • Ideal for third party vendors and ISVs
    • Ideal for organizations to integrate enterprise systems with SharePoint as service oriented architecture
  • Cons
    • Requires knowledge of complex WCF Services development
Enterprise Search API
  • Major Features
    • Query against Search Index – KeywordQuery and FullTextQuery API
    • This is Enterprise Search capability, not FAST Search
    • Perfect for Cross-farm, Cross-Web Application, or Cross-Site Collection Data Access
    • Use Search API to query large amount of data
  • Pros
    • Cross boundaries, including: farm, web application, site collections, sub sites, libraries, etc.
    • Ideal for Cross-Site Collection Navigation Menu
    • Can query across large data corpuses
    • Queries are fast even with huge data corpuses
    • Greater scalability, All other techniques are mostly SQL-based
  • Cons
    • Requires advanced search service administration and configuration.
    • Content needs to be in the indexed (dependency)
    • Content must be tagged with metadata (to be effective)
    • Crawl properties must be in index and Managed properties must be mapped to crawled properties.
Comparing SharePoint 2010 Server Side Data Technologies

No comments:

Post a Comment