How to store files in SQL Server Database without increasing its size or losing a link to it?
The ability to store files and to connect them to other important information from a database solution is an important advantage for a company. Very often, we might encounter the problem of establishing a solution which stores files, but on the other side the database size is increasing rapidly or the files are not accessible on the network path they were supposed to be stored on. In the past, there had been attempts to do that and even some solutions. So, historically speaking the most popular ways for doing that were:
1) storing the file in a database as a binary large object (BLOB), and
2) storing a network path in the database while the files are stored somewhere in the file system.
Both approaches are good, but they are two, not only one approach that will comprise all of the aforementioned features. In this case, Microsoft offers the best solution by combining the most adequate features of both of them. Therefore we can have better and optimized applications that store data appropriately. The new approach can offer a lot, starting from having metadata when the user stored the file, the type of it, the last time it was updated, the last time it was accessed and much more.
FileTable, an extension of FileStream feature as part of Microsoft SQL Server offers all of the aforementioned solutions. We, at InterWorks use this extension to help our clients develop more accessible and practical database solutions where they will get optimal solution for storing files.
What is FileStream and FileTable and what can we do with them?
The FileStream feature was introduced within SQL Server 2008. This feature is used for storing unstructured data, such as word documents, presentations, videos, audios and images on the file system with a pointer to that data in the database. The storage of the unstructured data in FileStream improves the performance by leveraging the NTFS APIs streaming along with additional benefits of the database system.
SQL Server 2012 enhanced FileStream even further by introducing FileTable feature. It allows an application to integrate its storage and data management components to allow non-transactional access and provide integrated SQL Server services like semantic search and full-text search over unstructured data.
FileStream data type is implemented as a varbinary (max) column in which the data is stored as a BLOB in the NTFS or ReFS file systems and a pointer to the data in the database. In this case, the storage is not restricted to 2GB as it is in the case of regular BLOB storage, meaning that the size of the FileStream columns are limited only by the volume of the file system.
To specify a column to store BLOB data on the file system, we need to specify the FileStream attribute on a varbinary(max) column. This causes the SQL Server Database Engine to store all data for that column on the file system instead of storing it in the database itself. According to Microsoft recommendations if the stored objects are on average larger than 1 MB, FileStream is the best option because of the faster read access. For smaller objects, storing varbinary(max) BLOBs in the database often provides better streaming performance.
FileTable is implemented using FileStream and enhances its capability even further. It allows direct, transactional access to large data, but in contrast to FileStream, the FileTable feature can be configured to allow non-transactional access as well, i.e. it enables access to files without prior authorization from the Database Engine. FileTable is a specialized user table with a pre-defined schema that stores FILESTREAM data, as well as file and directory hierarchy information and file attributes. The access is different and quite flexible. It is up to the user to define whether the data participates in transactions, and to what extent. Additionally, the access to the contents of a file system through the operating system can be enabled, thus the access through T-SQL can be bypassed. There is even an option for adding contents by pasting the files in the FileTable host directory. These new files are also visible to SQL Server.
Additionally, we can configure FileTable separately from FileStream having in mind that it is the only way for non-transactional access to FileStream. Even if non-transactional access is enabled, the already existing FileStream columns and applications will continue to behave in the same way as they did before. The non-transactional access can be managed, enabled or disabled, at the database level.
Comparison of FileStream and FileTable
|Feature||FileStream Solution||FileTable Solution|
|Single story for management tasks||Yes||Yes|
|Single set of services for search, reporting, querying, etc.||Yes||Yes|
|Integrated security model||Yes||Yes|
|In-place updates of FILESTREAM data||No||Yes|
|File and directory hierarchy maintained in the database||No||Yes|
|Windows application compatibility||No||Yes|
|Relational access to file attributes||No||Yes|
We have spent some time on comparing other features connected with T-SQL that can be used for storing files and we chose FileTable. When compared to the other features of SQL for storing files, it has many advantages and to name a few:
- Keeps information about:
- type of file
- file size cached
- the time when the file was created
- the last time the file was edited
- the last time the file was accessed
- if the file is directory
- if the file is offline
- if the file is hidden
- if the file is read-only
- if the file is an archive
- if the file is a system file
- if the file is temporary
- Every FileTable row represents a directory or a file;
- In order to maintain file namespace semantics, FileTable applies certain system-defined constraints and triggers;
- A FileTable represents a hierarchy of directories and files, where the related data to the nodes in that hierarchy is stored. This hierarchy starts from a root directory which is specified when the FileTable is created;
- For non-transactional file access, Windows API based solutions can access the stored file and directory data in the FIleTable through a Windows share.
- Whenever there are calls for creating or changing a file or directory through the Windows share, SQL Server component intercepts them and they are reflected in the corresponding relational data in the FileTable;
- We can query and update the FIleTable through normal Transact-SQL access. They are also integrated with SQL Server management tools, and features such as backup;
- Support of 3 actions: create, alter and drop of a FileTable;
Setting up FileStream and FileTable in SQL Server
- Enable FILESTREAM at the Instance level: SQL Server Configuration Manager -> Properties -> FileStream -> Set the desired options
- Enable FILESTREAM access level with Transact-SQL code
- Create a Database
- Create a FileTable
Accessing the share location where the data for a particular FileTable is stored
- Explore FileTable Directory with SQL Server Management Studio: Expand Tables -> FileTables -> Right-click on the created FileTable -> Choose option Explore FileTable Directory option
It will be good to have …
Both FileStream and FileTable features are not available on SQL Server 2017 on Linux due to being tightly integrated with the Windows operating system. Having these features ported on Linux will expand the possibilities of this platform.
Additionally, FileStream and FileTable features are also not available on SQL Azure Database. However, Azure cloud platform has a separate service, Azure Blob storage for storing unstructured data as objects/blobs.
FileStream and FileTable are features of SQL Server for storing unstructured data in SQL Server alongside other data. The FileStream feature stores unstructured data in the file system and keeps a pointer of the data in the database, whereas FileTable extends this feature even further allowing non-transactional access. One of the features of the non-transactional access is the ability to access files without prior authorization from the Database Engine from the shared location. In other words, the FileTable feature gives us the notion of managing files in the file system rather than in an SQL Server and in the same time the data can be accessed in a transactional way in the SQL Server as well.
Marija Karanfilovska and Goran Zdraveski