BLOBS vs DOCUMENTS in the SHR

Tucker_Mark · August 20, 2013, 1:42pm

We need to be careful about whether we are talking about Images or Documents. And, within Images, we should distinguish PACS vs Scanned images.

We (Regenstrief/Indiana) currently store decades of documents in our database.

The text documents fit in easily, and are snappy.

The rub happens when they are scanned documents. They really start feeling like images, and you start thinking of out-of-line storage. We certainly do not store radiology images; our architecture includes a box labeled PACS to handle them.

I would strongly prefer to store text reports within the RDB. Most text documents are very small, and I would think any EMR worth its salt already supports documents. From an architectural point of view, those EMRs, when acting as SHRs, handle documents internally.

We also have plenty of good experience storing pointers to out-of-line images within the RDB.

In our legacy system, the scanned documents (which occupy orders of magnitude less space than the PACS uses) are stored in a separate BLOB table, outside of
the clinical discrete results, and outside of our text-report-body table.

In our new system, we mix the scanned images with text-report-bodies. I’m not thrilled with that, but it works.

(I like keeping them separate because smaller tables are easier to manage. Plus, a separate RDB table makes it easier to swap that out and swap in some sort of image store if the storage problem gets out of hand.)
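
A minimal sketch of the separate-table layout described above, using SQLite only because it runs anywhere. The table and column names (`report`, `text_report_body`, `scanned_document`, `location`) are hypothetical and are not the Regenstrief schema. It shows small text bodies kept in line, scanned pages kept in their own BLOB table, and a pointer column that lets a scan move to an external image store later without touching the other tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE report (
    report_id  INTEGER PRIMARY KEY,
    patient_id INTEGER NOT NULL,
    title      TEXT NOT NULL
);
-- Small text bodies stay in the database next to the clinical data.
CREATE TABLE text_report_body (
    report_id INTEGER PRIMARY KEY REFERENCES report(report_id),
    body      TEXT NOT NULL
);
-- Scanned pages get their own table so it can be managed or replaced separately.
CREATE TABLE scanned_document (
    report_id INTEGER PRIMARY KEY REFERENCES report(report_id),
    mime_type TEXT NOT NULL,
    content   BLOB,  -- in-database copy, NULL once moved out of line
    location  TEXT,  -- pointer to an external image store (assumed URL form)
    CHECK (content IS NOT NULL OR location IS NOT NULL)
);
""")

conn.execute("INSERT INTO report VALUES (1, 42, 'Discharge summary')")
conn.execute("INSERT INTO text_report_body VALUES (1, 'Patient discharged in good health.')")
conn.execute("INSERT INTO report VALUES (2, 42, 'Scanned referral letter')")
conn.execute(
    "INSERT INTO scanned_document VALUES (2, 'image/tiff', ?, NULL)",
    (b"\x49\x49\x2a\x00",),
)

# Later, if storage gets out of hand: move the bytes elsewhere and keep only a pointer.
conn.execute(
    "UPDATE scanned_document SET content = NULL, location = ? WHERE report_id = 2",
    ("file:///archive/scans/2.tiff",),
)

rows = conn.execute(
    "SELECT r.title, s.location FROM report r JOIN scanned_document s USING (report_id)"
).fetchall()
print(rows)
```

Only `scanned_document` changes when the scan is moved out of line; the text reports and discrete results are untouched.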

On Tue, Aug 20, 2013 at 3:12 AM, Ryan Crichton wrote:

Hi Kari, Carl,

Thanks for raising these issues. This is definitely the right time to be delving into some of the more design-specific details of the SHR. Our focus for the coming calls will be to discuss these issues in detail and produce design documentation describing our feel for the best way to approach them.

I agree that storing documents in a relational database may not be ideal. It would be interesting to explore how OpenXDS manages documents; that could help us identify a direction that is known to work.

Cheers,

Ryan

On Fri, Aug 16, 2013 at 1:18 PM, Carl Fourie <carl@jembi.org> wrote:

Thanks for raising this issue Kari (and for the offline discussion). I think these are great issues to think of addressing as part of the design concepts as we move forward. Looking forward to seeing the rest of the group weigh in too.

On a first pass, my thought is that we need to decide at a high level the best method to handle document data types (my preference is a document-based db; my bias is now out there) and then figure out the technical challenges in this. As you have raised, there are concerns around the management of referential integrity between the two as well as the ability to audit. Personally I think we can design a solution that allows us to mitigate this, but I would love to hear the larger group’s thoughts on it too.

Cheers,

Carl

On Fri, Aug 16, 2013 at 11:44 AM, Kari Schoonbee <kari@jembi.org> wrote:

Thanks Ryan, I think OpenMRS is definitely a good choice and I feel confident that we can build a great SHR tool around it.

I wonder if we are ready to start delving a little bit more into the design of the SHR around OpenMRS now? As stated in the recommendation documents, the largest issue we face around OpenMRS is the storing of document-based data. I’ve been reading up on the pros and cons of storing blobs in relational databases and think this will make for quite an interesting discussion.

Personally I feel we should NOT be storing BLOBs (images etc.) directly in the obs table. This could lead the database to grow in size very quickly, and from what I gather it will lead to much slower performance, as the database is usually the bottleneck. It also consumes more space than storing files directly on the filesystem, and it makes your backup process much more cumbersome. For example, you can make incremental off-site backups of a filesystem directory using rsync, but network bandwidth limitations rule that out when everything is stored in a database that could grow to hundreds of GB or even TBs.

The main issue with storing files separately (whether on a filesystem or in a distributed NoSQL-type database) is that referential integrity and auditing are harder to achieve. Typically a file path or URL to the file would be stored as a simple observation in the database, but files can then be deleted or changed without this being reflected in the database record. SQL Server 2008 allows you to have a file pointer as a datatype, but the databases OpenMRS uses (MySQL or PostgreSQL) do not.
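
One way to picture the integrity concern: a rough sketch, assuming the pointer is stored together with a content hash so that an audit job can at least detect files that were deleted or changed behind the database's back. The `file_obs` table and the `record_file` and `audit` helpers are hypothetical names for illustration and are not part of OpenMRS.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def sha256_of(path):
    """Hex digest of the file's current contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE file_obs ("
    "obs_id INTEGER PRIMARY KEY, path TEXT NOT NULL, sha256 TEXT NOT NULL)"
)


def record_file(path):
    """Store the pointer plus the digest taken at write time."""
    cur = conn.execute(
        "INSERT INTO file_obs (path, sha256) VALUES (?, ?)",
        (str(path), sha256_of(path)),
    )
    return cur.lastrowid


def audit():
    """Return (obs_id, problem) for every pointer whose file is missing or changed."""
    problems = []
    rows = conn.execute("SELECT obs_id, path, sha256 FROM file_obs").fetchall()
    for obs_id, path, digest in rows:
        p = Path(path)
        if not p.exists():
            problems.append((obs_id, "missing"))
        elif sha256_of(p) != digest:
            problems.append((obs_id, "changed"))
    return problems


store = Path(tempfile.mkdtemp())
scan = store / "scan-001.tiff"
scan.write_bytes(b"original scan bytes")
obs_id = record_file(scan)

clean = audit()                  # file matches the recorded digest
scan.write_bytes(b"tampered")    # changed outside the database
after_edit = audit()
scan.unlink()                    # deleted outside the database
after_delete = audit()
print(clean, after_edit, after_delete)
```

This only detects drift after the fact; it does not prevent it, which is why restricting write access to a single service matters as well.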

I look forward to hearing other people’s thoughts on this.

Kind regards,

Kari

On Wednesday, August 14, 2013 9:38:45 AM UTC+2, Ryan Crichton wrote:

Hi all,

Over the last few months we have been reviewing tools that could be used as a SHR and have produced a document (link) that explains the conclusions of this review and our recommendation. On the community call yesterday we came to the decision that we are happy with the recommendation that is set out in the document and we should move forward with it. To summarise, the recommendation is as follows:

OpenMRS is the tool we believe we can have the most success with and we believe it will provide a good base for creating a SHR. We recommend that we move forward with OpenMRS as the tool to use to create a SHR for the needs of OpenHIE.

We also note that there is the RAMRS tool that Regenstrief have developed that could also act as a SHR, and this will be taken forward by the Regenstrief team in a separate track to see if it could fit the role too. However, the focus of this community will be to move forward with OpenMRS as the technology choice for the SHR.

If you were not on the call and have specific comments that you would like to add or concerns you would like to raise, please do so. We’d be happy to hear them.

Thanks all for the hard work in getting to this point.

Cheers,

Ryan

–

Ryan Crichton

Software Developer, Jembi Health Systems | SOUTH AFRICA

Mobile: +27845829934 | Skype: ryan.graham.crichton
E-mail: ry...@jembi.org

–
You received this message because you are subscribed to the Google Groups “Shared Health Record (OpenHIE)” group.
To unsubscribe from this group and stop receiving emails from it, send an email to
openhie-shr+unsubscribe@googlegroups.com.
For more options, visit
https://groups.google.com/groups/opt_out.

–
Carl Fourie
Assistant Director of Programs, Jembi Health Systems | SOUTH AFRICA
Mobile: +27 71 540 4477 | Office: +27 21 701 0939 | Skype: carl.fourie17
E-mail: carl@jembi.org


Kari_Schoonbee · August 20, 2013, 2:45pm

@Mark I think with "documents" we are referring to anything that is not structured data. So this could be images, audio recordings, video recordings, scanned documents or text documents.
I agree that storing text documents in the RDB is probably fine from a backup/performance point of view, but if we are going to implement a separate store for other types of documents anyway, is there a reason for not including them in that store?

I'd be interested to know more about your experience with storing pointers to files in the RDB. I was thinking about using something like Amazon S3, which allows you to just store a URL to a document. Riak CS is open source and implements the S3 API. My concerns were mainly around keeping the two stores in sync and ensuring that they are only accessible through the OpenSHR, so that one is never updated without the other.
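
A rough sketch of the "only accessible through the OpenSHR" idea, assuming a single gateway owns both writes and removes the stored object again if the database insert fails. `InMemoryObjectStore` is a hypothetical stand-in for an S3-style store (it is not the S3 or Riak CS client API), and `DocumentGateway` and `document_ref` are illustrative names only.

```python
import sqlite3
import uuid


class InMemoryObjectStore:
    """Hypothetical stand-in for an S3-style object store."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def delete(self, key):
        self.objects.pop(key, None)


class DocumentGateway:
    """Single entry point that writes the object and its database pointer together."""

    def __init__(self, conn, store):
        self.conn = conn
        self.store = store
        conn.execute(
            "CREATE TABLE IF NOT EXISTS document_ref ("
            "patient_id INTEGER NOT NULL, object_key TEXT NOT NULL UNIQUE)"
        )

    def save(self, patient_id, data):
        key = f"documents/{uuid.uuid4()}"
        self.store.put(key, data)
        try:
            with self.conn:  # commits on success, rolls back on exception
                self.conn.execute(
                    "INSERT INTO document_ref VALUES (?, ?)", (patient_id, key)
                )
        except Exception:
            # Compensate so no orphaned object is left without a pointer.
            self.store.delete(key)
            raise
        return key


store = InMemoryObjectStore()
gateway = DocumentGateway(sqlite3.connect(":memory:"), store)
key = gateway.save(42, b"%PDF-1.4 example bytes")

try:
    gateway.save(None, b"would be orphaned")  # violates NOT NULL, so the insert fails
except sqlite3.IntegrityError:
    pass

print(list(store.objects) == [key])
```

The compensation step is best effort: if the process dies between the two writes, an orphaned object can still remain, so a periodic reconciliation job would probably still be needed.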

As for PACS, scanned images, DICOM… I'm not sure we should distinguish. I think we should just enable someone to get whatever they store in the SHR back out. If a PACS system wants to sync to the SHR, we don't even need to know that they're doing PACS.

@Ryan I know that OpenXDS also uses relational databases; I wonder how that scales and what backup strategies are used where it is implemented.

Kari Schoonbee
Software Engineer - Jembi Health Systems | SOUTH AFRICA
Mobile: +27 83 488 3025 | Office: +27 21 701 0939
E-mail: kari@jembi.org


Odysseas_Pentakalos · August 20, 2013, 2:54pm

I have a little background with OpenXDS, so I'll shed some light on its design to save you the time of reading through docs and code.

OpenXDS was designed with an abstraction around the persistence of documents. The document metadata is stored in a relational database, but the document itself is passed to a persistence-layer abstraction for handling. The default implementation stores the document on the file system and essentially stores a pointer to the file in the relational database along with the other metadata. The thinking was that implementers can use their existing infrastructure for storing documents if they prefer that over the filesystem; doing so would require them to develop an adapter that implements the specifics of their infrastructure for persisting the document.

Essentially, using OpenXDS you could store the documents in a filesystem without doing any additional work or move them to a database or the cloud (such as EC2) by simply implementing an adapter that handles the actual persistence.
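
To make the adapter idea concrete, here is a minimal sketch of that kind of abstraction. It is written in Python for brevity and does not reproduce OpenXDS's actual interfaces; `DocumentStore`, `FileSystemStore` and `Registry` are hypothetical names, and a dictionary stands in for the relational metadata table.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class DocumentStore(ABC):
    """Persistence abstraction: callers never see where the bytes end up."""

    @abstractmethod
    def save(self, doc_id, content):
        """Persist the content and return a pointer to it."""

    @abstractmethod
    def load(self, location):
        """Return the content behind a pointer."""


class FileSystemStore(DocumentStore):
    """Default adapter: one file per document under a root directory."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, doc_id, content):
        path = self.root / f"{doc_id}.bin"
        path.write_bytes(content)
        return str(path)

    def load(self, location):
        return Path(location).read_bytes()


class Registry:
    """Keeps metadata plus the pointer; stands in for the relational side."""

    def __init__(self, store):
        self.store = store
        self.metadata = {}

    def register(self, doc_id, mime_type, content):
        location = self.store.save(doc_id, content)
        self.metadata[doc_id] = {"mime_type": mime_type, "location": location}

    def retrieve(self, doc_id):
        return self.store.load(self.metadata[doc_id]["location"])


registry = Registry(FileSystemStore(tempfile.mkdtemp()))
registry.register("doc-1", "text/xml", b"<ClinicalDocument/>")
print(registry.retrieve("doc-1"))
```

Moving to a database or a cloud store would then mean writing another `DocumentStore` subclass and passing it to `Registry`, with no change to the metadata handling.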

Best regards,
Odysseas


--
Odysseas Pentakalos, Ph.D., PMP
Chief Technology Officer
SYSNET International, Inc.
2930 Oak Shadow Drive
Oak Hill, Virginia 20171
mailto:odysseas@sysnetint.com
(703) 855-2029

dritz · August 20, 2013, 3:10pm

Hi all.

I think we should take to heart that, for our purposes, we will need to
expect all CDA "documents" to be level 3, meaning they are structured (HL7v3
RIM-conformant) XML containers. This way, we can take some comfort that
we're not walking away from any of the benefits we currently enjoy with our
V2 messages and the fact that those can be readily mapped to database
fields. Level 3 CDAs can be mapped to database fields.
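
As a small illustration of that mapping, here is a sketch that pulls one coded observation out of a structured entry and writes it to a table. The XML is a simplified fragment in the HL7 v3 namespace, not a complete or validated CDA document, and the `obs` table layout is an assumption made for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

NS = {"hl7": "urn:hl7-org:v3"}

# Simplified structured entry (assumed example data, not a full CDA document).
FRAGMENT = """
<observation xmlns="urn:hl7-org:v3"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             classCode="OBS" moodCode="EVN">
  <code code="8867-4" codeSystem="2.16.840.1.113883.6.1" displayName="Heart rate"/>
  <effectiveTime value="20130820"/>
  <value xsi:type="PQ" value="72" unit="/min"/>
</observation>
"""

entry = ET.fromstring(FRAGMENT)
code = entry.find("hl7:code", NS)
when = entry.find("hl7:effectiveTime", NS)
value = entry.find("hl7:value", NS)

# Each coded element becomes an ordinary column value.
row = (
    code.get("code"),
    code.get("displayName"),
    float(value.get("value")),
    value.get("unit"),
    when.get("value"),
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE obs (concept_code TEXT, concept_name TEXT, "
    "value_numeric REAL, unit TEXT, obs_date TEXT)"
)
conn.execute("INSERT INTO obs VALUES (?, ?, ?, ?, ?)", row)
print(conn.execute("SELECT * FROM obs").fetchall())
```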

My $0.02...

DJ

Derek Ritz, P.Eng., CPHIMS-CA
ecGroup Inc.
+1 (905) 515-0045
www.ecgroupinc.com


···

----------------------------------------------------------------------------
----
Le présent courriel et les documents qui y sont joints sont confidentiels et
protégés et s'adressent exclusivement au destinataire mentionné ci-dessus.
L'expéditeur ne renonce pas aux droits et privilèges qui s'y rapportent ni à
leur caractère confidentiel. Toute prise de connaissance, diffusion,
utilisation ou reproduction de ce message ou des documents qui y sont
joints, ainsi que des renseignements que chacun contient, par une personne
autre que le destinataire prévu est interdite. Si vous recevez ce courriel
par erreur, veuillez le détruire immédiatement et m'en informer.

-----Original Message-----
From: openhie-shr@googlegroups.com [mailto:openhie-shr@googlegroups.com] On
Behalf Of Kari Schoonbee
Sent: August 20, 2013 10:45 AM
To: Tucker, Mark
Cc: openhie-shr@googlegroups.com
Subject: Re: BLOBS vs DOCUMENTS in the SHR

@Mark I think with "documents" we are referring to anything that is not
structured data. So this could be images, audio recordings, video
recordings, scanned documents or text documents.
I agree that storing text documents in the RDB is probably fine from a
backup/performance point of view, but if we are going to implement a
separate store for other types of documents anyway, is there a reason for
not including them in that store?

I'd be interested to know more about your experience with storing pointers
to files in the RDB. I was thinking about using something like Amazon S3
that allows you to just store an URL to a document. Riak CS is open source
and implements the S3 API. My concerns were mainly around keeping the two
stores in sync and ensuring that they are only accessible through the
OpenSHR to ensure that the one is never updated without the other.

As for PACS, scanned images, DICOM I'm not sure we should distinguish. I
think we should just enable someone to get whatever they store in the SHR
back out. If a PACS system wants to sync to the SHR, we don't even need to
know that they're doing PACS.

@Ryan I know that OpenXDS also uses relational databases, I wonder how that
scales and what backup strategies are used where it is implemented.

Kari Schoonbee
Software Engineer - Jembi Health Systems | SOUTH AFRICA
Mobile: +27 83 488 3025 | Office: +27 21 701 0939
E-mail: kari@jembi.org


Tucker_Mark · August 20, 2013, 8:41pm

@Kari,
I'm glad you are interested in images!

Our approach is that a document is either local (i.e., a text document), or it is a "remote image".
A remote image is an abstraction of a URL:

  access_method:handle
  legato:13.2312.342.23
  synapse:322.99292
  local_image:389892

We then have a mapping from access method to handler.

"legato" maps to an ImageFetcher routine (in Java) that knows how to dereference images from the external "Legato" document handling system.

Similarly, the handler for "synapse" knows that the handle is actually a DICOM OID that is known to the Synapse PACS system.

We do not store host names, etc., in the handle. So if Synapse moves from machine to machine, we don't change the pointers.

The "local_image" handler just says that the contents are in a special LOCAL_IMAGE_BODY table, and not in the same table as the text-report documents.
We have been waiting for an excuse to move the LOCAL_IMAGE_BODY rows into the file system, but haven't done it yet.
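
The access_method:handle scheme can be sketched roughly as follows. This is an illustration in Python rather than the Java ImageFetcher code itself; the function names, the `HANDLERS` registry and the strings the handlers return are all made up for the example. Real handlers would talk to the remote system, and would hold host names in their own configuration rather than in the pointer.

```python
from typing import Callable, Dict, Tuple


# Hypothetical handlers: each receives the opaque handle and describes where
# the content lives. A real handler would contact Legato, the PACS, etc.
def fetch_legato(handle: str) -> str:
    return f"Legato document {handle}"


def fetch_synapse(handle: str) -> str:
    return f"DICOM object {handle} in the Synapse PACS"


def fetch_local_image(handle: str) -> str:
    return f"row {handle} of LOCAL_IMAGE_BODY"


# Mapping from access method to handler. Host names are deliberately absent
# from the pointers, so moving a remote system means changing only this side.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "legato": fetch_legato,
    "synapse": fetch_synapse,
    "local_image": fetch_local_image,
}


def parse_pointer(pointer: str) -> Tuple[str, str]:
    """Split 'access_method:handle' at the first colon."""
    method, sep, handle = pointer.partition(":")
    if not sep or not method or not handle:
        raise ValueError(f"not an access_method:handle pointer: {pointer!r}")
    return method, handle


def resolve(pointer: str) -> str:
    """Look up the handler for the access method and pass it the handle."""
    method, handle = parse_pointer(pointer)
    try:
        handler = HANDLERS[method]
    except KeyError:
        raise LookupError(f"no handler registered for {method!r}") from None
    return handler(handle)


for pointer in ("legato:13.2312.342.23", "synapse:322.99292", "local_image:389892"):
    print(pointer, "->", resolve(pointer))
```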

[[Why do I want text-document bodies to live in the RDB ?]] Because, absent other text searching, I want to be able to use SQL to peek into the content. Furthermore, our fastest displays render the REPORT-TEXT bodies inline (so one "html page" may show hundreds of document bodies, letting clinicians quickly scroll through and glance at the contents).
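
As a rough illustration of "using SQL to peek into the content", here is a small sketch against an in-memory SQLite database. The `report_text_body` table, its columns and the sample rows are invented for the example and do not reflect the actual Regenstrief schema.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE report_text_body (report_id INTEGER PRIMARY KEY, body TEXT)"
)
# Invented sample rows, for illustration only.
db.executemany(
    "INSERT INTO report_text_body VALUES (?, ?)",
    [
        (1, "Chest X-ray: no acute cardiopulmonary process."),
        (2, "Discharge summary: patient stable, follow up in two weeks."),
        (3, "Chest CT: small pleural effusion on the left."),
    ],
)


def peek(term: str) -> list:
    """Return the ids of reports whose body contains the given term."""
    rows = db.execute(
        "SELECT report_id FROM report_text_body "
        "WHERE body LIKE ? ORDER BY report_id",
        (f"%{term}%",),
    ).fetchall()
    return [row[0] for row in rows]


print(peek("Chest"))
print(peek("pleural"))
```

A plain LIKE scan like this is only practical because text bodies are small; it is a fallback for when no dedicated text search is available.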

The reports whose bodies are "remote images" are rendered as "See attached document [>ICON GOES HERE<]", and the user must click on it to see the image. Often, our image fetcher never fetches the bits at all. It may just render a page that points the user to the remote system itself. For example, the "synapse" fetcher just sends them off (with AUTO_LOGIN) to the remote PACS system, while the "onbase" fetcher fetches the bits from the remote system and then presents them directly to the user (who is unaware that we went off-site to find his bits!).
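
The display rule described above (text bodies inline, remote images behind a click) might look something like this. It is a sketch only: the `Report` class, the HTML shape and the `/image?ptr=` route are hypothetical, not part of any real system mentioned in this thread.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional
from urllib.parse import quote


@dataclass
class Report:
    title: str
    text_body: Optional[str] = None       # local text document, shown inline
    remote_pointer: Optional[str] = None  # "access_method:handle", fetched on click


def render(report: Report) -> str:
    """Render text bodies inline; render remote images as a link to click."""
    title = escape(report.title)
    if report.text_body is not None:
        return f"<h3>{title}</h3><pre>{escape(report.text_body)}</pre>"
    if report.remote_pointer is not None:
        # Hypothetical route; the fetcher behind it may redirect to the
        # remote system or fetch the bits itself, depending on the handler.
        href = "/image?ptr=" + quote(report.remote_pointer, safe="")
        return f'<h3>{title}</h3><a href="{href}">See attached document</a>'
    raise ValueError("report has neither a text body nor a remote pointer")


print(render(Report("Progress note", text_body="BP 120/80, afebrile.")))
print(render(Report("Chest CT", remote_pointer="synapse:322.99292")))
```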

[[Why do I make a distinction between Radiology and Scanned Images, and "attachments" ??]]

To me, it is a matter of engineering scale. I am happy to assume that there is some big fat PACS system out there, and its terabytes of data are its problem. I am happy to know how my tiny RDB pointers refer over to the PACS.

Similarly, a heavy duty document scanning system will have its own massive data store (and backup machinery), and I'm happy to point into it. And, extra points to vendors who give me an API to "fetch the bits", and also give me a user-interface API that supports single sign on .... in case the users want to go over to a higher performance interface.

By "attachments", I mean a much smaller-scale scanning or PDF problem, in which a small volume of documents comes over in PDF format, or comes in as PDF + flat text, or as flat text + handful.of.key.images.

In this case, I'll store the data part locally (LOCAL_IMAGE_BODY, or in my filesystem, or in a very_light_weight NoSQL system..)

... Mark


From: openhie-shr@googlegroups.com [mailto:openhie-shr@googlegroups.com] On Behalf Of Ryan Crichton
Sent: Tuesday, August 20, 2013 3:12 AM
To: Carl Fourie
Cc: Kari Schoonbee; openhie-shr@googlegroups.com
Subject: Re: Our recommended tool to create a SHR: OpenMRS

Hi Kari, Carl,

Thanks for raising these issues. This is definitely the right time to be delving into some of the more design-specific details of the SHR. Our focus for the coming calls will be to discuss these issues in detail and produce design documentation describing what we feel is the best way to approach them.

I agree that storing documents in a relational database may not be ideal. It would be interesting to explore how OpenXDS manages documents; that could help us identify a direction that is known to work.

Cheers,
Ryan

On Fri, Aug 16, 2013 at 1:18 PM, Carl Fourie <carl@jembi.org> wrote:
Thanks for raising this issue, Kari (and for the offline discussion). I think these are great issues to address as part of the design concepts as we move forward. Looking forward to seeing the rest of the group weigh in too.

On a first pass, my thought is that we need to decide at a high level on the best method to handle document data types (my preference is a document-based db -- my bias is now out there) and then figure out the technical challenges. As you have raised, there are concerns around managing referential integrity between the two stores, as well as the ability to audit. Personally I think we can design a solution that mitigates this, but I would love to hear the larger group's thoughts too.

Cheers,
Carl

On Fri, Aug 16, 2013 at 11:44 AM, Kari Schoonbee <kari@jembi.org> wrote:
Thanks Ryan, I think OpenMRS is definitely a good choice and I feel confident that we can build a great SHR tool around it.

I wonder if we are ready to start delving a little bit more into the design of the SHR around OpenMRS now? As stated in the recommendation documents, the largest issue we face around OpenMRS is the storing of document-based data. I've been reading up on the pros and cons of storing blobs in relational databases and think this will make for quite an interesting discussion.

Personally I feel we should NOT be storing BLOBS (images etc.) directly in the obs table. This could lead the database to grow in size very quickly, and from what I gather it will lead to much slower performance, as the database is usually the bottleneck. It also consumes more space than storing files directly on the filesystem, and it makes your backup process much more cumbersome. For example, you can do incremental off-site backups of a directory on a filesystem using rsync, but network bandwidth limitations make this impractical when everything is stored in a database that could grow to hundreds of GB or even TBs.

The main issue with storing files separately (whether on a filesystem or in a distributed NoSQL-type database) is that referential integrity and auditing are harder to achieve. Typically this would require a file path or URL to the file to be stored as a simple observation in the database, but it's possible for files to be deleted or changed without this being reflected in the database record. SQL Server 2008 allows you to have a file pointer as a datatype, but MySQL and PostgreSQL, which is what OpenMRS uses, do not.

I look forward to hearing other people's thoughts on this.

Kind regards,
Kari

On Wednesday, August 14, 2013 9:38:45 AM UTC+2, Ryan Crichton wrote:
Hi all,

Over the last few months we have been reviewing tools that could be used as a SHR and have produced a document (link) that explains the conclusions of this review and our recommendation. On the community call yesterday we came to the decision that we are happy with the recommendation set out in the document and that we should move forward with it. To summarise, the recommendation is as follows:

OpenMRS is the tool we believe we can have the most success with, and we believe it will provide a good base for creating a SHR. We recommend that we move forward with OpenMRS as the tool to use to create a SHR for the needs of OpenHIE.

We also note that there is the RAMRS tool that Regenstrief have developed that could also act as a SHR. This will be taken forward by the Regenstrief team in a separate track to see if it could fit the role too. However, the focus of this community will be to move forward with OpenMRS as the technology choice for the SHR.

If you were not on the call and have specific comments that you would like to add or concerns you would like to raise, please do so. We'd be happy to hear them.

Thanks all for the hard work in getting to this point.

Cheers,
Ryan

--
Ryan Crichton
Software Developer, Jembi Health Systems | SOUTH AFRICA
Mobile: +27845829934 | Skype: ryan.graham.crichton
E-mail: ry...@jembi.org
--
You received this message because you are subscribed to the Google Groups "Shared Health Record (OpenHIE)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to openhie-shr+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
Carl Fourie
Assistant Director of Programs, Jembi Health Systems | SOUTH AFRICA
Mobile: +27 71 540 4477 | Office: +27 21 701 0939 | Skype: carl.fourie17
E-mail: carl@jembi.org

--
Ryan Crichton
Software Developer, Jembi Health Systems | SOUTH AFRICA
Mobile: +27845829934 | Skype: ryan.graham.crichton
E-mail: ryan@jembi.org

ryan · August 21, 2013, 11:51am

Hi all,

Great discussion.

I think one of the key takeaways from this discussion is that we will need different mechanisms to handle different sorts of documents. Some documents are unstructured and some are structured. Derek makes it clear that there are 3 levels of CDA. CDA level 3 is completely structured and can be decomposed and stored in a relational database as concepts. CDA level 1, on the other hand, is completely unstructured, and we will need a mechanism to store it ‘as is’, for example as a BLOB or on the file system. Then there is CDA level 2, a combination of the two, where we will likely need to link structured and unstructured content under the same document.

So for the SHR, I see the following things that we will need to support when considering documents:

- Storing and querying document metadata.
- Decomposing structured documents into a relational database (OpenMRS would be very good at this due to its concept dictionary).
- Storing unstructured documents using a suitable mechanism (OpenMRS doesn’t have this mechanism; we will need to build it).
- Decomposing a document into its structured and unstructured parts and storing each in its appropriate mechanism, with a link between them (we will need to build the link mechanism in OpenMRS).
- Storing links to remote documents in the case of PACS or other large images/video/audio (I don’t think we have to concern ourselves with actually fetching these remote documents, as that could be left to an OpenHIM mediator or the client).

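
To show how the requirements above could fit together, here is a very rough sketch that routes a document by CDA level, using the three-level split exactly as summarised in this post. None of it is OpenMRS code: `StoredDocument`, `store_document`, the `blob-N` keys and the dict standing in for an unstructured store are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoredDocument:
    metadata: Dict[str, str]                  # always stored and queryable
    observations: List[Dict[str, str]] = field(default_factory=list)  # structured part
    body_ref: Optional[str] = None            # link to the unstructured part


def store_document(
    metadata: Dict[str, str],
    cda_level: int,
    blob_store: Dict[str, bytes],
    entries: Optional[List[Dict[str, str]]] = None,
    body: Optional[bytes] = None,
) -> StoredDocument:
    """Route the parts of a document to the mechanism suited to each."""
    if cda_level not in (1, 2, 3):
        raise ValueError(f"unknown CDA level: {cda_level}")
    doc = StoredDocument(metadata=dict(metadata))
    if cda_level >= 2 and entries:
        # Structured content would be decomposed into relational rows.
        doc.observations = list(entries)
    if cda_level <= 2 and body is not None:
        # Unstructured content is stored 'as is'; the document keeps the link.
        key = f"blob-{len(blob_store) + 1}"
        blob_store[key] = body
        doc.body_ref = key
    return doc


blobs: Dict[str, bytes] = {}
scan = store_document({"type": "scan"}, 1, blobs, body=b"%PDF...")
mixed = store_document(
    {"type": "discharge"}, 2, blobs,
    entries=[{"concept": "WEIGHT", "value": "70"}], body=b"narrative text",
)
print(scan.body_ref, mixed.body_ref, len(mixed.observations))
```

Links to remote documents (PACS and the like) would be one more kind of `body_ref`, with the fetching left to the mediator or the client.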
Cheers,

Ryan

···

On Tue, Aug 20, 2013 at 10:41 PM, Tucker, Mark mtucker2@regenstrief.org wrote:

@Kari,

I’m glad you are interested in images.!

Our approach is that a document is either local (ie, a text document), or it is a “remote image”.

A remote image is an abstraction of a url
    access_method:handle

    legato:13.2312.342.23

    synapse:322.99292

    local_image:389892
We then have a mapping from access method, to handler.

“legato” maps to an ImageFetcher routine (in java) that knows how to deference images from the external “Legato” document handling system.

Similarly, the handler for “synapse” knows that the handle is actually a DICOM oid that is known by the synapse PACS system.

We do not remember host names/etc, in the handle. So, if Synapse moves from machine to machine, we don’t change the pointers.

The “local_image” handler just says that the contents are in a special LOCAL_IMAGE_BODY table, and not in the same table as the text-report documents.

We have been waiting for an excuse to move the LOCAL_IMAGE_BODY’s into the file system, but haven’t done it yet.

[[Why do I want text-document bodies to live in the RDB ?]] Because, absent other text searching, I want to be able to use SQL to peek into the content. Furthermore, our fastest displays render the REPORT-TEXT bodies inline (so one “html page” may show hundreds of document bodies, so that clinicians can quickly scroll through and glance at the contents.

The reports whose bodies have “remote images” are render as “See attached document [>ICON GOES HERE<]”, and the user must click on it to see it. Often, our image fetcher doesn’t even ever fetch the bits. It may just render an page that points the user off to the remote system itself. For example the “synapse” fetcher just sends them off (with AUTO_LOGIN) to the remote PACS system, while the “onbase” fetch fetches the bits from the remote system, and then presents them directly to the user (who is unaware that we went off-site to find his bits!)

[[Why do I make a distinction between Radiology and Scanned Images, and “attachments” ??]]

To me, it is a matter of engineering scale. I am happy to assume that there is some big fat PACS system out there, and its terabytes of data are its problem. I am happy to know how my tiny RDB pointers refer over to the PACS.

Similarly, a heavy duty document scanning system will have its own massive data store (and backup machinery), and I’m happy to point into it. And, extra points to vendors who give me an API to “fetch the bits”, and also give me a user-interface API that supports single sign on … in case the users want to go over to a higher performance interface.

By “attachments”, I mean a much smaller scale scanning or PDF problem, in which a small volume of docuements come over in PDF format, or come in in PDF + flat text, or flat text + handful.of.key.images.

In this case, I’ll store the data part locally (LOCAL_IMAGE_BODY, or in my filesystem, or in a very_light_weight NoSQL system…)

… Mark

-----Original Message-----

From: Kari Schoonbee [mailto:kari@jembi.org]

Sent: Tuesday, August 20, 2013 10:45 AM

To: Tucker, Mark

Cc: openhie-shr@googlegroups.com

Subject: Re: BLOBS vs DOCUMENTS in the SHR

@Mark I think with “documents” we are referring to anything that is not structured data. So this could be images, audio recordings, video recordings, scanned documents or text documents.

I agree that storing text documents in the RDB is probably fine from a backup/performance point of view, but if we are going to implement a separate store for other types of documents anyway, is there a reason for not including them in that store?

I’d be interested to know more about your experience with storing pointers to files in the RDB. I was thinking about using something like Amazon S3 that allows you to just store an URL to a document. Riak CS is open source and implements the S3 API. My concerns were mainly around keeping the two stores in sync and ensuring that they are only accessible through the OpenSHR to ensure that the one is never updated without the other.

As for PACS, scanned images, DICOM… I’m not sure we should distinguish. I think we should just enable someone to get whatever they store in the SHR back out. If a PACS system wants to sync to the SHR, we don’t even need to know that they’re doing PACS.

@Ryan I know that OpenXDS also uses relational databases, I wonder how that scales and what backup strategies are used where it is implemented.

Kari Schoonbee

Software Engineer - Jembi Health Systems | SOUTH AFRICA

Mobile: +27 83 488 3025 | Office: +27 21 701 0939

E-mail: kari@jembi.org

On 20 Aug 2013, at 3:42 PM, “Tucker, Mark” mtucker2@regenstrief.org wrote:

We need to be careful about whether we are talking about Images or Documents. And, within Images, we should distinguish PACS vs Scanned images.

We (Regenstrief/Indiana) currently store decades of documents in our database.

The text documents fit in easily, and are snappy.

The rub happens when they are scanned documents … They really start feeling like images, and you start thinking of out-of-line storage. We certainly do not store Radiology images …Our architecture includes a box labeled PACS to handle them.

I would strongly prefer to store text reports within the RDB. Most text documents are very small, and I would think any EMR worth its salt already supports documents. From an architectural point of view those EMRS, when acting as SHR’ handle documents internally.

We also have plenty of good experience storing pointers to out-of-line images within the RDB.

In our legacy system, the scanned documents (which occupy orders of magnitude less space than the PACS uses) are stored in a separate BLOB table, outside of the clinical discrete results, and outside of our text-report-body table.

In our new system, we mix the scanned images with text-report-bodies. I’m not thrilled with that, but it works.

(I like it separate because it is easier to manage smaller tables. Plus, by using a separate RDB table, it makes it easier to swap that out and swap in some sort of image store if the storage problem gets out of hand.)

From: openhie-shr@googlegroups.com [mailto:openhie-shr@googlegroups.com] On Behalf Of Ryan Crichton

Sent: Tuesday, August 20, 2013 3:12 AM

To: Carl Fourie

Cc: Kari Schoonbee; openhie-shr@googlegroups.com

Subject: Re: Our recommended tool to create a SHR: OpenMRS

Hi Kari, Carl,

Thanks for raising these issues. This is definitely the right time to be delving into some of the more design specific details of the SHR. Our focus for the coming calls will be to discuss these issues in detail and produce design documentation describing our feel for the best way to approach these issues.

I agree storing documents in a relation database may not be the most ideal. It would be interesting to explore how OpenXDS manages documents, that could help us in identifying a direction that is known to work.

Cheers,

Ryan

On Fri, Aug 16, 2013 at 1:18 PM, Carl Fourie carl@jembi.org wrote:

Thanks for raising this issue Kari (and for the offline discussion). I think these are great issues to think of addressing as part of the design concepts as we move forward. Looking forward to seeing the rest of the group weigh in too.

On a first pass my thought is that we need to decide at a high level the best method to handle document data types (my preference is a document based db – my bias is now out there) and then figure out the technical challenges in this. As you have raised there are concerns around the management of referencial integrity between the two as well as the ability to audit too. Personally I think we can design a solution that allows us to mitigate it but I would love to hear the larger groups thoughts on this too.

Cheers,

Carl

On Fri, Aug 16, 2013 at 11:44 AM, Kari Schoonbee kari@jembi.org wrote:

Thanks Ryan, I think OpenMRS is definitely a good choice and I feel confident that we can build a great SHR tool around it.

I wonder if we are ready to start delving a little bit more into the design of the SHR around OpenMRS now? As stated in the recommendation documents, the largest issue we face around OpenMRS is the storing of document-based data. I’ve been reading up on the pro’s and con’s of storing blobs in relational databases and think this will make for quite an interesting discussion.

Personally I feel we should NOT be storing BLOBS (images etc) directly in the obs table. This could lead the database to grow in size very quickly and from what I gather it will lead to much slower performance, as the database is usually the bottleneck. It also consumes more space than storing files directly on the filesystem and it makes your backup process much more cumbersome. For example, you can do incremental backups of a directory on a FS off-site using rsync, but you can’t do this when everything is stored in a database that could grow to be 100’s of GB or TB’s due to network bandwidth limitations.

The main issue with storing files separately (whether on a filesystem or in a distributed NoSQL type database) is that referential integrity and auditing is harder to achieve. Typically this would require a file path or URL to the file to be stored as a simple observation in the database, but it’s possible that files can get deleted or changed without this being reflected in the record in the database. SQL Server 2008 allows you to have a file pointer as a datatype, but not MySQL or PostgreSQL, which is what OpenMRS uses.

I look forward to hearing other people’s thoughts on this.

Kind regards,

Kari

On Wednesday, August 14, 2013 9:38:45 AM UTC+2, Ryan Crichton wrote:

Hi all,

Over the last few months we have been reviewing and tools that could be used as a SHR and have produced a document (link) that explains the conclusions of this review and explains our recommendation. On the community call yesterday we came to the decision that we are happy with the recommendation that is set out in the document and we should move forward with it. To summaries this recommendation is as follows:

OpenMRS is the tool we believe we can have the most success with and we believe it will provide a good base for creating a SHR. We recommend that we move forward with OpenMRS as the tool to use to create a SHR for the needs of OpenHIE.

We also note that there is the RAMRS tool that Regenstrief have developed that also could act as a SHR and this will be taken forward by the Regenstrief team in a separate track to see if it could fit the role to. However, the focus of this community will be to move forward with OpenMRS as the technology choice for the SHR.

If you were not on the call and have specific comments that you would like to add or concerns you would like to raise, please do so. We’d be happy to hear them.

Thanks all for the hard work in getting to this point.

Cheers,

Ryan

–

Ryan Crichton

Software Developer, Jembi Health Systems | SOUTH AFRICA

Mobile: +27845829934 | Skype: ryan.graham.crichton

E-mail: ry...@jembi.org

–

You received this message because you are subscribed to the Google Groups “Shared Health Record (OpenHIE)” group.

To unsubscribe from this group and stop receiving emails from it, send an email to openhie-shr+unsubscribe@googlegroups.com.

For more options, visit https://groups.google.com/groups/opt_out.

–

Carl Fourie

Assistant Director of Programs, Jembi Health Systems | SOUTH AFRICA

Mobile: +27 71 540 4477 | Office: +27 21 701 0939 | Skype: carl.fourie17

E-mail: carl@jembi.org

Steve_Ross-Talbot · November 17, 2016, 9:44am

Old thread I know, but I really need some sort of mechanism to store large things (voice notes and documents).

Did anyone sort out the document (or BLOB) storage issue with OpenMRS? And did it get contributed back into open source? If so, where can I find it?

Thanks

Steve T

Hannes_Venter · November 18, 2016, 5:16am

Hi Steve,

Yes, we did end up creating modules to handle the unstructured data. There are actually two options:

the OpenSHR content handler module has a default blob handler that stores the data straight to the file system;
then there’s the unstructured data module, which uses Riak (NoSQL).

(this doc might help explain how these modules fit together in the SHR)

In both cases it’s a reuse of OpenMRS’s complex observations.
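
For readers new to the pattern, here is a rough sketch of the general idea of swappable blob handlers behind one interface. This is not the OpenSHR or OpenMRS code: `BlobStore`, `FileSystemBlobStore`, and `InMemoryBlobStore` are hypothetical names, and the in-memory class merely stands in for a key-value store such as Riak.

```python
import hashlib
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Dict


class BlobStore(ABC):
    """Hypothetical interface: callers keep only the returned key in the database."""

    @abstractmethod
    def put(self, data: bytes) -> str:
        """Store the payload and return the key to record in the database."""

    @abstractmethod
    def get(self, key: str) -> bytes:
        """Return the payload previously stored under key."""


class FileSystemBlobStore(BlobStore):
    """Writes each payload to a file named after its SHA-256 digest."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        (self.root / key).write_bytes(data)
        return key

    def get(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryBlobStore(BlobStore):
    """Stand-in for a key-value (NoSQL) backend; keeps payloads in a dict."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._items[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._items[key]
```

Because both backends return the same kind of key, the record in the relational database would not need to change if the storage backend were swapped.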

Hope this helps.

Kind Regards

Hannes

–

Hannes Venter

Senior Software Developer

Jembi Health Systems | SOUTH AFRICA

Mobile: +27 73 276 2848 | Office: +27 21 701 0939 | Skype: venter.johannes

E-mail: hannes@jembi.org
