Updated Architecture: Cluster Multi Process (markdown)

TJKoury 2016-11-08 12:39:23 -05:00
parent 068470265c
commit fdb58aa250
1 changed files with 22 additions and 10 deletions

@ -2,16 +2,34 @@
***
##Multi-Process Model
#Multi-Process Model
The multi-process version of Node-RED starts the admin server, interface, and API on the Master process, then spawns Workers that run the flows independently of the Master process.
##Overview
This Node-RED fork uses the native Node.js cluster module for multi-processing.
###Design Notes
1. A master process starts, reads the settings file, and initializes at least one worker process.
2. Each worker process runs the full Node-RED runtime, including the admin interface.
3. As in the base Node-RED configuration, http / ws nodes use the admin server.
4. The master process assigns a 'bingo' worker that is responsible for input / event nodes that have no inputs. This includes nodes that hold persistent connections to external resources (mqtt, twitter, etc.).
5. Every message sent from a node goes through a message broker that is called in the Node.send method.
6. If there is only one worker process, all message passing works as usual.
7. Otherwise, the broker evaluates the sender and target nodes and routes messages accordingly. All nodes can have a .cluster property that tells Node-RED how to handle them. If it is missing, Node-RED makes an assumption based on the worker type and the number of inputs into the node.
8. When a new flow is created and deployed, the receiving worker emits a message to all workers to reload the flows from storage.
9. The new context system allows different persistence sources to be added for contexts. The default is an in-memory datastore in the master process, which is shared by all worker processes.
10. If a context is initialized with the 'shared' parameter equal to true, the .get method on the returned object accepts an optional callback, allowing for async storage.
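
The routing rules in the design notes (the single-worker shortcut, the `.cluster` property, and the fallback based on input count) can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the function names, the `'bingo'` / `'any'` mode values, the returned route labels, and the shape of `ctx` are hypothetical and are not taken from the fork's code.

```javascript
// Hypothetical sketch of the broker's routing decision; not the fork's actual code.
// A node is modelled as { cluster?: string, inputs: number }.
// ctx describes the current process: { workerCount: number, isBingo: boolean }.

// Resolve how a node wants to be handled in cluster mode.
function clusterMode(node) {
  if (node.cluster) return node.cluster;      // an explicit .cluster property wins
  // Assumed fallback: nodes without inputs (event sources) belong on the bingo worker.
  return node.inputs === 0 ? 'bingo' : 'any';
}

// Decide where a message from `sender` to `target` should be delivered.
function routeMessage(sender, target, ctx) {
  // With a single worker, message passing works as usual.
  if (ctx.workerCount <= 1) return 'local';

  const targetMode = clusterMode(target);
  // A bingo-only target reached from another worker must be forwarded.
  if (targetMode === 'bingo' && !ctx.isBingo) return 'to-bingo';
  // Assumed policy: work leaving a bingo-only sender may be spread across workers.
  if (targetMode === 'any' && clusterMode(sender) === 'bingo') return 'to-any-worker';
  return 'local';
}
```

For example, `routeMessage({ inputs: 0 }, { inputs: 1 }, { workerCount: 4, isBingo: true })` yields `'to-any-worker'` under these assumed rules, while the same call with `workerCount: 1` yields `'local'`.
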
Communication from the Worker processes to the Master is handled via IPC.
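
As a rough illustration of worker-to-master IPC with the Node.js cluster module, the sketch below forks workers and listens for their messages. The helper names `startMaster` and `reportToMaster` and the `{ topic, payload }` message shape are assumptions for this example; only `cluster.fork()`, the worker `'message'` event, and `process.send()` are standard Node.js APIs.

```javascript
// Hypothetical helpers; the cluster API is injected so the logic stays easy to test.

// Master side: fork workers and subscribe to the messages they send over IPC.
function startMaster(clusterApi, workerCount, onMessage) {
  const workers = [];
  for (let i = 0; i < workerCount; i++) {
    const worker = clusterApi.fork();
    // Each forked worker emits 'message' when the child calls process.send().
    worker.on('message', (msg) => onMessage(worker, msg));
    workers.push(worker);
  }
  return workers;
}

// Worker side: send a message up to the master.
function reportToMaster(proc, topic, payload) {
  // process.send only exists when the process was spawned with an IPC channel.
  if (typeof proc.send !== 'function') return false;
  proc.send({ topic, payload });
  return true;
}

// Possible wiring (left commented out so that loading this file forks nothing):
// const cluster = require('cluster');
// if (cluster.isMaster) {
//   startMaster(cluster, 2, (worker, msg) => console.log(worker.id, msg));
// } else {
//   reportToMaster(process, 'status', 'ready');
// }
```
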
##Files
| File | Description |
|:-----|:------------|
|clusterRED.js| In charge of general process management. Returns a 'clusterRED' object. Has methods to initialize master and Worker processes, spawn and kill new processes, provide status on child processes. |
|/cluster/index.js| In charge of general process management. Returns a 'clusterRED' object. Has methods to initialize master and Worker processes, spawn and kill new processes, provide status on child processes. |
|/cluster/master.js| Initialization code for the master process. Spawns workers, attaches events to the master cluster process. |
|/cluster/masterEvents.js | Attaches events to the worker process objects in the master process. Handles messages that are emitted by worker processes. |
|/cluster/worker.js | Attaches events to the worker process objects in the master process. Handles messages that are emitted by worker processes. |
|/cluster/workerEvents.js | Attaches events to the worker processes. Handles messages emitted from the master process. |
|red.js| Controls instantiation of admin server, API, core admin UI, and static server.|
|red/runtime/events.js| Modification to the runtime event bus. Sends messages from Worker to Master, which are then broadcast to all other Workers. Messages from master are [marked](https://github.com/TJKoury/node-red/blob/cluster/clusterRED.js#L134) so as not to cause a rebroadcast cascade.|
|red/api/comms.js| Enables websocket comms bus if on Master process.|
@ -23,12 +41,6 @@ Communication from the Worker processes to the Master is handled via IPC.
|red/runtime/storage/localfilesystem.js|Modified to only allow Master to log.|
|settings.js|Modified to expose the cluster module in function nodes, also provides the default setting for number of Workers.|
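
The rebroadcast marking described for `red/runtime/events.js` can be sketched as follows. This is a minimal illustration, not the fork's implementation: the `fromMaster` flag and both function names are hypothetical, and the workers are assumed to expose the `id` and `send()` members that Node.js cluster workers provide.

```javascript
// Master side: relay an event from one worker to every other worker,
// marking it so that receivers do not send it back up again.
function relayFromMaster(workers, senderId, event) {
  const marked = Object.assign({}, event, { fromMaster: true }); // hypothetical marker
  let delivered = 0;
  for (const worker of workers) {
    if (worker.id === senderId) continue; // the originating worker already has it
    worker.send(marked);
    delivered++;
  }
  return delivered;
}

// Worker side: only events raised locally are forwarded to the master.
// Marked events stop here, which prevents a rebroadcast cascade.
function shouldSendToMaster(event) {
  return event.fromMaster !== true;
}
```

Copying the event before marking it keeps the sender's original object untouched, so a locally raised event is never mistaken for a relayed one.
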
##Issues
- [ ] **Clustering Admin Interface**. To make it practical to keep the default behavior of using the admin httpServer as the primary interface for http / websocket nodes, the entire admin interface must be used in cluster mode. Running a server on the master process restricts the amount of data that can be serialized/deserialized in IPC as multiple workers are handling the requests that originate and end in the master process. **However**, running the admin interface in worker processes also poses its own problems. As any worker process might be the one that a web client connects to, all messages headed to the browser must be serialized and sent to all workers. This will cause a massive increase (n * workers - 1) in the amount of data handled to inform the admin interface, from node registry events to node status events, debugging, etc.
- [ ] **Modified nodes**. As nodes are capable of binding ports, listening to file-system events, spawning other processes, etc., simply clustering by forking can lead to serious issues. There appear to be two options: modifying the nodes to play nice, or building in strict limitations on what nodes can do.
- [ ] **Config node capabilities**. Currently, config nodes (and certain nodes like TCP) are allowed to bind ports, meaning that if flows are loaded equally in the Master and child processes, the Master process will listen on that port, and then block the port from being [bound specifically to distribute incoming to Workers](https://github.com/nodejs/node/blob/master/lib/cluster.js#L116). Some nodes might be run only in the Master (inject) and [serve events to Workers based on a round-robin](https://github.com/TJKoury/node-red/blob/cluster/clusterRED.js#L88). Limiting config nodes to run only on Worker processes, then moving any code that could cause issues into a config node, could be a standardization solution to this issue.
- [ ] **Static server**. Run single-processed on Master, or instantiate new static path on each clustered http-server instance? Does this option reside in the http-server config node menu, along with a static path?
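
The round-robin idea raised under config node capabilities (a Master-only node handing events to Workers in turn) could look roughly like this. It is a sketch under stated assumptions: `createRoundRobin` is a hypothetical helper, and each worker is assumed to expose a `send()` method as Node.js cluster workers do.

```javascript
// Hypothetical round-robin dispatcher for events raised on the master process.
function createRoundRobin(workers) {
  let next = 0; // index of the worker that receives the next event
  return function dispatch(event) {
    if (workers.length === 0) throw new Error('no workers available');
    const worker = workers[next];
    next = (next + 1) % workers.length; // wrap around after the last worker
    worker.send(event);
    return worker;
  };
}
```

With three workers, four consecutive `dispatch` calls go to workers 1, 2, 3, and then 1 again, so each event is handled by exactly one worker.
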
##General Notes