Blob metadata

Mar 15, 2009 at 4:32 PM
When archiving a blob, it would be desirable to store it with its metadata. Is this possible?
Mar 16, 2009 at 8:08 AM
The RBS library API manages the reference to the remote blob (the metadata) within SQL Server, while the BLOB data itself is managed entirely by the external Content Addressable Storage (CAS) system. The CAS might have its own metadata. Operations at the storage level (CAS) are not reported or propagated back to the RBS library.

Mike/Pradeep may be able to shed more light on this.

-Saju
Mar 17, 2009 at 6:33 PM
I think the answer to your question is "it depends."

It will depend on whether the "Remote Blob Store provider" you use can store metadata with a blob. The provider would have to use the available features of the RBS API to pass that metadata to the backend storage device whenever an RBS store/write occurs. I'm pretty sure that no provider offers this functionality at this time. I believe the Centera team at EMC would consider trying it if a current customer required the feature.

I'd also like to hear if Mike/Pradeep have any more to say on the subject. ;-)

John
Editor
Mar 17, 2009 at 7:03 PM

rglissmann, can you please elaborate on the scenario you are thinking of? We think of RBS metadata as the RBS tables themselves: they contain all collection, pool and blob information. Sometimes it is not enough to know just the BlobId corresponding to a blob in the store; you also need to know which collection it belongs to, whether and when it was deleted, and so on.

The contract with blob stores allows RBS to move a blob from one collection, or even one database, to another without the blob store knowing about it. Moreover, RBS could make even bigger changes, such as mapping multiple user blobs to one store blob (de-duplication) or mapping one user blob to multiple store blobs (composite blobs/appending to blobs). These things are not implemented today, but nothing prevents them from being implemented in the future. Because of all these possible changes, the only up-to-date metadata that can be trusted lies in the database itself (RBS tables/views). So the recommended way to back up the metadata is to back up the database itself (or at least the RBS tables/filegroup).
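To make the point concrete, here is a minimal sketch of why only the database side can be trusted. It is a toy model, not the real RBS schema: `BlobCatalog`, its methods and all the ids are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BlobCatalog:
    """Hypothetical stand-in for the RBS tables; NOT the real RBS schema."""

    # user blob id -> (collection name, ordered list of store blob ids)
    entries: dict = field(default_factory=dict)

    def register(self, user_blob_id, collection, store_blob_ids):
        self.entries[user_blob_id] = (collection, list(store_blob_ids))

    def move(self, user_blob_id, new_collection):
        # Only the catalog changes; the blob store is never told about the move.
        _, store_ids = self.entries[user_blob_id]
        self.entries[user_blob_id] = (new_collection, store_ids)

    def owners_of(self, store_blob_id):
        # With de-duplication, several user blobs may reference one store blob.
        return sorted(
            user_id
            for user_id, (_, store_ids) in self.entries.items()
            if store_blob_id in store_ids
        )


catalog = BlobCatalog()
catalog.register("user-1", "invoices", ["store-A"])
catalog.register("user-2", "contracts", ["store-A"])            # de-duplicated
catalog.register("user-3", "invoices", ["store-B", "store-C"])  # composite blob
catalog.move("user-1", "archive")
```

Any metadata written next to "store-A" at store time would still say "invoices" and would know about only one owner, while the catalog reflects the current state.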

 

Let me know whether I answered your question and if you have any further questions. We would also be interested in knowing what scenario you had in mind.

 

Regards,

Pradeep.

Mar 17, 2009 at 7:25 PM
Pradeep,

The situation I'm thinking about is archiving documents for compliance. If the blob needs to be saved for seven years, will the metadata need to be online in the database for that period? Maybe there could be hierarchies of databases where metadata associated with archived blobs is relegated to an "archive" database. The benefit here is to save money by hosting that database on cheaper storage, because it doesn't need to be high performance. The functionality needed is the ability to look up the metadata associated with a blob.

-Randy
Editor
Mar 17, 2009 at 7:39 PM

You can always take a backup of the database and store the .bak file on cheap storage. This metadata backup is “passive” or “point-in-time” and doesn’t need to be maintained in the online database.

 

You may have to do something special in case somebody tries to delete the blob from the online database: you would have to lock the blob down so that the blob store does not delete it.

 

Mar 17, 2009 at 8:11 PM
FYI, an EMC Centera can support retention periods on blobs, so if someone tries to delete them from the Centera device before the retention period expires, the Centera will not allow it. I believe "retention" is usually a feature of CAS-type devices. Currently there is no way to control the retention features of a Centera device through RBS (via the Centera RBS provider). (Again, I believe that EMC would work with anyone who had a requirement for the feature.)
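As a rough illustration of what store-side retention means, here is a toy model. It is not the Centera API or the RBS provider interface: `RetainingStore`, `RetentionError` and the blob id are hypothetical, and seven years is approximated as 7 * 365 days.

```python
from datetime import datetime, timedelta


class RetentionError(Exception):
    """Raised when a delete is attempted before the retention period ends."""


class RetainingStore:
    """Toy model of a store that enforces retention; NOT the Centera API."""

    def __init__(self):
        self._blobs = {}  # blob id -> (data, retain_until)

    def write(self, blob_id, data, stored_at, retention):
        self._blobs[blob_id] = (data, stored_at + retention)

    def delete(self, blob_id, now):
        _, retain_until = self._blobs[blob_id]
        if now < retain_until:
            raise RetentionError(f"{blob_id} is retained until {retain_until}")
        del self._blobs[blob_id]


store = RetainingStore()
store.write("doc-1", b"contract", datetime(2009, 3, 17), timedelta(days=7 * 365))

try:
    store.delete("doc-1", now=datetime(2010, 1, 1))
    deleted_early = True
except RetentionError:
    deleted_early = False  # the store refused the early delete

# Once the retention period has passed, the same call is allowed.
store.delete("doc-1", now=datetime(2017, 1, 1))
```

The point of the sketch is that the refusal happens inside the store, regardless of what the online database asks for.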

John