Horizontal Scalability

If we can do “it” in parallel, we don’t have to worry so much about scaling OpenMRS to millions of patients.

horizontal_scalability.txt (1.82 KB)

···

Mark Tucker

Systems Engineer

Regenstrief Institute

(317)423-5552

mtucker2@regenstrief.org


Hi Mark,

Yes, this could work well: Mod10 scaling is a very elegant idea.

And I absolutely support the idea of scaling in parallel.

The only problem I can see is that it'll be difficult to scale on an ad-hoc basis, i.e. it'll be difficult to add extra nodes to the cluster after the system is already running. The reason is that if we figure out which node a patient's data should go into by modding the MRN by X, where X is the current cluster size, we can't increase X without affecting existing patient data. I.e. if a patient was allocated to node (MRN mod X), and we grow the cluster by Y nodes, they should now be allocated to node (MRN mod (X+Y)), meaning that all existing records would need to be moved to their new nodes.

It could of course be done, but it might mean significant down-time for the system, especially with large numbers of records; in fact it might not even be practical with very large numbers of records. (It's basically the same issue as dynamic resizing in a dictionary/hashtable data structure: if you resize the underlying array, you need to rehash and re-allocate all the elements from the old array to the new one (http://en.wikipedia.org/wiki/Hashtable#Dynamic_resizing).)
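To make the cost concrete, here is a minimal sketch (the class and method names are mine, and MRNs are assumed to be numeric) of mod-X allocation, counting how many records a resize from X = 10 to X + Y = 12 nodes would displace:

    public class ModPartitionDemo {

        // Assign a patient to a node by modding the numeric MRN by the
        // current cluster size X.
        static int nodeFor(long mrn, int clusterSize) {
            return (int) (mrn % clusterSize);
        }

        public static void main(String[] args) {
            // Count how many of 1,000,000 simulated MRNs land on a different
            // node after growing the cluster from X = 10 to X + Y = 12.
            int moved = 0;
            for (long mrn = 0; mrn < 1_000_000; mrn++) {
                if (nodeFor(mrn, 10) != nodeFor(mrn, 12)) moved++;
            }
            // Prints 833330: roughly 5 of every 6 records would have to be
            // migrated to a new node after the resize.
            System.out.println(moved + " of 1000000 records would move");
        }
    }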

Of course I'm not trying to imply that this is a prohibitive issue; we just need to be aware that if we go this route, we might have to accept this limitation.

Does anybody have other ideas about how to address this? Does it even need to be addressed, or could we be happy with always having, say, 10 nodes? (Certainly having 10 nodes that can't scale ad hoc is much better than 1 OpenMRS instance that can't scale ad hoc.)

Cheers

Hannes

···



Hannes Venter
Software Developer, Jembi Health Systems | SOUTH AFRICA
Mobile: +27 73 276 2848 | Office: +27 21 701 0939 | Skype: venter.johannes
E-mail: hannes@jembi.org

The notion of horizontal scalability is a great one. Being able to add more SHR nodes dynamically would be ideal, as then we could scale out as needed much more easily. To generalise, we don't need to focus too much on the implementation of the distribution of nodes, but rather note that it is an important factor to consider in the design of an SHR.

Cheers,

Ryan

···



Ryan Crichton

Senior Software Developer, Jembi Health Systems | SOUTH AFRICA

Mobile: +27845829934 | Skype: ryan.graham.crichton
E-mail: ryan@jembi.org


Horizontal scaling is much more easily achieved with NoSQL databases, which we will probably end up using for the document store. As this type of database is also likely to grow in size faster (due to handling larger files like images, audio and video), I think it's most important that this part of the SHR be able to scale dynamically.

For the discrete data store, we'd have to figure out how large the database is likely to get. If the entire discrete DB can fit on a single machine's storage, we can implement synchronous replication between nodes (so that each node contains all the data) and use a load balancer to distribute read access between them. This would allow us to add new nodes and just configure the load balancer to access them.
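To illustrate (just a sketch, with names and types of my own choosing, not tied to any particular database product): once every node holds the full discrete store, spreading reads can be as simple as a round-robin pick over the replica list, and adding a node is adding one entry:

    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    public class ReadBalancer {
        private final List<String> replicaUrls;   // one entry per fully-replicated node
        private final AtomicInteger next = new AtomicInteger();

        public ReadBalancer(List<String> replicaUrls) {
            this.replicaUrls = replicaUrls;
        }

        // Every node holds the full discrete store (synchronous replication),
        // so any replica can serve any read; round-robin spreads the load.
        // Adding a node is just adding its URL to the list.
        public String nextReadNode() {
            return replicaUrls.get(Math.floorMod(next.getAndIncrement(), replicaUrls.size()));
        }
    }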

I'm not an expert on database design, so please feel free to correct me if I'm completely wrong. :wink:

Kari


The fun thing about this kind of horizontal scalability is that it does not require anything tricky in each DB node. They will not synchronize between themselves.

You can imagine that the HIM, on receipt of a message, does an MRN->Node# lookup. Once we pay for a simple lookup (instead of a computation on an MRN), then MRNs can be distributed to nodes arbitrarily. I.e., start with 8 nodes and add new patients to node 9. When node 9 gets busy, create a new node and start allocating to node 10.

NOTE: If we believe that our on-checkin-patient HL7 download is "complete" (that is, all clinical data can be transmitted "authentically" in the HL7 download), then migrating a patient from node to node is as easy as:

[1] generate a full dump of Patient 3 on Node#5

[2] load the full dump into Node#6.
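A minimal sketch of that lookup-based routing (the class and method names are mine, purely illustrative): the HIM resolves MRN->Node# through a table rather than a computation, new patients go to whichever node is currently open, and migration ends by repointing one table entry:

    import java.util.HashMap;
    import java.util.Map;

    public class NodeRegistry {
        private final Map<String, Integer> mrnToNode = new HashMap<>();
        private int openNode = 9;   // e.g. nodes 1-8 are full; node 9 takes new patients

        // Look up the node that holds this patient; unseen MRNs are
        // assigned to the currently open node.
        public synchronized int nodeFor(String mrn) {
            return mrnToNode.computeIfAbsent(mrn, m -> openNode);
        }

        // When the open node gets busy, bring a new node online and start
        // allocating new patients to it; existing assignments are untouched.
        public synchronized void openNewNode(int nodeNumber) {
            openNode = nodeNumber;
        }

        // After [1] dumping the patient from the old node and [2] loading the
        // dump on the target node, migration is just repointing the entry.
        public synchronized void migrate(String mrn, int targetNode) {
            mrnToNode.put(mrn, targetNode);
        }
    }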

Mark Tucker


Hi all.

A piece of happy news is that, if we don't want to "roll our own" in terms of how scale-out happens, a number of commercial and open source database offerings automagically scale out to multiple parallel instances. We could leave the "how" up to implementers, who could always just pick one of those.

It is interesting and useful, however, to contemplate where we expect scaling should “happen”. Does the HIM layer have to know where all the SHRs are, or does the SHR “service” abstract multiple physical SHRs to the HIM? How we do this may also depend on what will be used as our interface to communicate with the SHR. For example, are we using XDS, which automatically supports federated XDS repositories – or are we using something proprietary that perhaps forces us to “roll our own”?

Food for thought, and discussion, I hope… :)

DJ

Derek Ritz, P.Eng., CPHIMS-CA

ecGroup Inc.

+1 (905) 515-0045

www.ecgroupinc.com


Hi all,

So in terms of the requirements, I think what we are all really saying is that being able to distribute horizontally is a feature we would want for an SHR. This seems fairly obvious. I think the implementation of that would depend on the technologies at play in the final solution.

Derek, you raise an interesting point: should the HIM be aware of the multiple distributed instances? My initial thought is no. That should be something abstracted away by the SHR, such that adding additional nodes can be transparent to the HIM. The SHR should handle the complexity of distributing the load between its nodes. It would be interesting to hear what others have to say about this.

Cheers,

Ryan


On this issue, I actually favor making the SHR simpler, and the HIM more sophisticated. Let the HIM know about the partitions.

For example, if the SHR were a single-instance OpenMRS, then that SHR would in fact know nothing about the other SHRs or how patients are partitioned. It would just do its job on the patients that are assigned to it.

The actual partitioning logic should be so simple (i.e., just a call to the "whatPartition" function offered up by the core functionality: CentralNode.whatPartition(aPatient)) that the HIM could easily do it, or even a client might (if we wanted to expose that aspect to the outside world, which I would try to avoid).

Thinking of the SHR as simple gives us flexibility in use: we could use OpenMRS, or perhaps others, out of the box as SHRs.
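As a sketch of that call ("CentralNode" and "whatPartition" are Mark's hypothetical names; the lookup-table body, the String MRN parameter and everything else here are my assumptions, following the earlier MRN->Node# lookup idea):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public final class CentralNode {
        // HIM-side routing table, per the earlier MRN->Node# lookup discussion.
        private static final Map<String, Integer> routingTable = new ConcurrentHashMap<>();
        private static volatile int openPartition = 1;   // partition taking new patients

        private CentralNode() {}

        // The one call the HIM (or, if we chose to expose it, a client)
        // needs: which SHR partition owns this patient's record?
        public static int whatPartition(String mrn) {
            return routingTable.computeIfAbsent(mrn, m -> openPartition);
        }
    }

The SHR nodes themselves stay completely partition-unaware; only this one function knows the layout.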

Mark Tucker


I must admit – I favour an SHR that knows how to manage its data, even if that management must cross multiple instances in order to support scaling out. There is a wide array of products available that already know how to do that, including OpenMRS (if deployed against a MySQL cluster, for example). My sense is that this will usefully simplify things for us – including, perhaps, addressing the resync issue we're trying to solve right now between CR merge/split and the SHR (just as one example).

As an overarching premise, I believe that we should be thinking of OpenHIE as a reference implementation more than as an implementation. That is to say, when one deploys a system there are times when (for various reasons, some of them good… some not so much) one finds oneself making a "deal with the devil" and doing something expedient. But a reference implementation is where we have an opportunity to instantiate an exemplar of the design we SHOULD follow, whether or not we always do when it comes time to actually implement.

My $0.02…

DJ


Tucker, Mark mtucker2@regenstrief.org

Hmm.

I am persuaded. I made my “SHR” too small.

The better mental model is that the “SHR” is the piece of the system that stores the patient data (in the three forms, and with the prescribed codes), and which knows how to send it back out.

One implementation could be a super-high-performance clustered SQL database. Another might be a bunch of parallel OpenMRS systems. But those are ***implementation details***, and the reference model should be:

· Data(forPatient: P) → in

· Data(forPatient: P) ← out
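In code form, that reference model is just a two-method contract. A minimal sketch with hypothetical names (this is not an actual OpenMRS or OpenHIE API):

```java
// Sketch of the reference model above: an SHR is anything that can take
// patient data in and send it back out. All names are illustrative.
public interface SharedHealthRecord {

    /** Data(forPatient: P) → in — store data for patient P. */
    void store(String patientId, PatientData data);

    /** Data(forPatient: P) ← out — return everything held for patient P. */
    PatientData fetch(String patientId);
}

/** Placeholder for "the three forms, with the prescribed codes". */
class PatientData {
    final String payload;
    PatientData(String payload) { this.payload = payload; }
}
```

Whether the thing behind that interface is one clustered database or a bunch of parallel OpenMRS instances then genuinely becomes an implementation detail.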

Carl Fourie carl@jembi.org

I’m divided on this idea. While abstracting the “load balancing” and “scaling” of the SHR is a concept I agree with, I’m not very comfortable with this sitting in the HIM – it effectively pushes the responsibility for scaling the SHR onto the HIM.

Would it not make more sense to follow Ryan’s approach, where the SHR is responsible for its own scaling and rather has a load-balancing mechanism associated with it?
Regards,

Carl

···

From: openhie-shr@googlegroups.com [mailto:openhie-shr@googlegroups.com] On Behalf Of Derek Ritz (ecGroup)
Sent: Wednesday, April 24, 2013 9:50 AM
To: ‘openhie-shr’
Subject: RE: Horizontal Scalability

I must admit – I favour an SHR that knows how to manage its data, including when that management must cross multiple instances in order to support scaling out. There is a wide array of products available that already know how to do that, including OpenMRS (if deployed against a MySQL cluster, for example). My sense is that this will usefully simplify things for us – including, perhaps, addressing the resync issue we’re trying to solve right now between CR merge/split and the SHR (just as one example).

As an overarching premise, I believe that we should be thinking of OpenHIE as a reference implementation more than as an implementation. That is to say – when one deploys a system, there are times when (for various reasons, some of them good… some not so much) one finds oneself making a “deal with the devil” and doing something expedient. But a reference implementation is where we have an opportunity to instantiate an exemplar of the design we SHOULD follow, whether or not we always do when it comes time to actually implement.

My $0.02…

DJ

**Derek Ritz**, P.Eng., CPHIMS-CA

ecGroup Inc.

+1 (905) 515-0045

www.ecgroupinc.com


From: openhie-shr@googlegroups.com [mailto:openhie-shr@googlegroups.com] On Behalf Of Tucker, Mark
Sent: April 24, 2013 9:27 AM
To: openhie-shr
Subject: RE: Horizontal Scalability

On this issue, I actually favor making the SHR simpler, and the HIM more sophisticated. Let the HIM know about the partitions.

For example, if the SHR were a single-instance OpenMRS, then that SHR would in fact know nothing about the other SHRs or how patients are partitioned. It would just do its job on the patients that are assigned to it.

The actual partitioning logic should be so simple (i.e., just a call to the “whatPartition” function offered up by the core functionality: CentralNode.whatPartition(aPatient)) that the HIM could easily do it, or even a client might (if we wanted to expose that aspect to the outside world, which I would try to avoid).

Thinking of the SHR as simple gives us flexibility in use. We could use OpenMRS, or perhaps others, out of the box as SHRs.
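As a sketch of what that could look like on the HIM side (only the name CentralNode.whatPartition comes from the message above; the node list, routing method, and mod-10 body are assumptions for illustration):

```java
// Hypothetical sketch: the HIM resolves the owning SHR node for each
// inbound message via the central partition function, then forwards it.
import java.util.List;

public class HimRouter {

    private final List<String> nodeUrls; // one base URL per SHR node

    public HimRouter(List<String> nodeUrls) {
        this.nodeUrls = nodeUrls;
    }

    /** Pick the SHR node that owns this patient's record. */
    public String route(String mrn) {
        return nodeUrls.get(CentralNode.whatPartition(mrn));
    }
}

class CentralNode {
    /** Simplest possible placement — the "Mod10" scheme. */
    static int whatPartition(String mrn) {
        return Math.floorMod(mrn.hashCode(), 10);
    }
}
```

Note that the individual SHR nodes stay completely unaware of each other, which is exactly the simplicity being argued for here.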

From: rg.crichton@gmail.com [mailto:rg.crichton@gmail.com] On Behalf Of Ryan
Sent: Wednesday, April 24, 2013 2:49 AM
To: Derek Ritz (ecGroup)
Cc: Tucker, Mark; openhie-shr
Subject: Re: Horizontal Scalability

Hi all,

So in terms of the requirements, I think what we are really all saying is that being able to distribute horizontally is a feature we would want for an SHR. This seems fairly obvious. I think the implementation of that would depend on the technologies at play in the final solution.

Derek, you raise an interesting point: should the HIM be aware of the multiple distributed instances? My initial thought is no. That should be something abstracted away by the SHR, such that adding additional nodes can be transparent to the HIM. The SHR should handle the complexity of distributing the load between its nodes. It would be interesting to hear what others have to say about this.

Cheers,

Ryan

On Mon, Apr 22, 2013 at 7:33 PM, Derek Ritz (ecGroup) derek.ritz@ecgroupinc.com wrote:

Hi all.

A piece of happy news is that, if we don’t want to “roll our own” in terms of how scale-out happens, a number of commercial and open-source database offerings automagically scale out to multiple parallel instances. We could leave the “how” up to implementers, who could always just pick one of those.

It is interesting and useful, however, to contemplate where we expect scaling should “happen”. Does
the HIM layer have to know where all the SHRs are, or does the SHR “service” abstract multiple physical SHRs to the HIM? How we do this may also depend on what will be used as our interface to communicate with the SHR. For example, are we using XDS, which
automatically supports federated XDS repositories – or are we using something proprietary that perhaps forces us to “roll our own”?

Food for thought, and discussion, I hope… :-)

DJ


**Derek Ritz**, P.Eng., CPHIMS-CA

ecGroup Inc.

+1 (905) 515-0045

www.ecgroupinc.com


From: openhie-shr@googlegroups.com [mailto:openhie-shr@googlegroups.com] On Behalf Of Tucker, Mark
Sent: April 22, 2013 12:40 PM
To: openhie-shr
Subject: RE: Horizontal Scalability

The fun thing about this kind of horizontal scalability is that it does not require anything tricky in each DB node.

They will not synchronize between themselves.

You can imagine that the HIM, on receipt of a message, does a MRN->Node# lookup.

Once we pay for a simple lookup (instead of a computation on an MRN), MRNs can be distributed to nodes arbitrarily.

I.e., start with 8 nodes, and add new patients to node 9. When node 9 gets busy, create a new node, and start allocating to node 10.
NOTE: If we believe that our on-checkin-patient HL7 download is “complete” (that is, all clinical data can be transmitted “authentically” in the HL7 download), then migrating a patient from node to node is as easy as:

[1] generate full dump of Patient 3 on Node#5

[2] load full dump into Node#6.
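Sketched end to end below (dumpPatient/loadPatient are stand-ins for whatever “complete” HL7 export and import the nodes actually expose; they are not real OpenMRS calls), including the step left implicit above: repointing the lookup once the data has landed.

```java
// Hypothetical migration flow over the PartitionRegistry sketched earlier.
public class PatientMover {

    /** Move one patient between nodes, per the two steps above. */
    public static void migrate(ShrNode from, ShrNode to,
                               PartitionRegistry registry, String mrn) {
        String dump = from.dumpPatient(mrn);     // [1] full dump from source
        to.loadPatient(mrn, dump);               // [2] load into target
        registry.reassign(mrn, to.nodeNumber()); // repoint the lookup
    }
}

/** Stand-in for a node's "complete" HL7 download/upload interface. */
interface ShrNode {
    int nodeNumber();
    String dumpPatient(String mrn);
    void loadPatient(String mrn, String hl7);
}
```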

From: rg.crichton@gmail.com [mailto:rg.crichton@gmail.com] On Behalf Of Ryan
Sent: Monday, April 22, 2013 5:40 AM
To: Hannes Venter
Cc: Tucker, Mark; openhie-shr
Subject: Re: Horizontal Scalability

The notion of horizontal scalability is a great one. Being able to add more SHR nodes dynamically would be ideal, as then we could scale out as needed much more easily. To generalise: we don’t need to focus too much on the implementation of the distribution of nodes, but rather note that it is an important factor to consider in the design of an SHR.

Cheers,

Ryan
