#3147 - Review of cloud filesystem support

Identifier #3147
Issue type Feature request or suggestion
Title Review of cloud filesystem support
Status Completed
Tags

ocProducts client-work (likely) (custom)

Roadmap: v11 (custom)

Type: Cloudification (custom)

Type: Cross-cutting feature (custom)

Type: External dependency (custom)

Type: Performance (custom)

Handling member Chris Graham
Addon core
Description There are a few possible approaches to automatic synching of the filesystem on the cloud:
1) Mount the entire install on shared storage
2) Implement Composr's sync_file function, automatically detecting what change was made to a file and then synching it out
3) Using a different subpath for all custom folders, mounting it under a path that is a shared storage mount (i.e. at the operating system level)
4) Using a different subpath for all custom folders, mounting it under a path that is a PHP file wrapper, and setting up so URLs under there are picked up by the Apache configuration too
5) rsync
6) Moving everything into the database
7) Use of an internal CDN transfer API instead of direct filesystem writing, with URLs generated according to that API (i.e. no direct correspondence between a URL and any particular file path)

It's tricky to know what to do, but we want something very architecturally clean and maintainable, not lots of different approaches needing expert configuration. If we define some design goals we can eliminate some approaches.

a) Files should be hostable on a CDN so that they may be served geographically close to the user. This will improve page load times.
b) Our CDN may not be able to host every kind of media (e.g. Cloudinary could not host non-images).
c) We need to be able to delete files.
d) It has to be reliable.
e) It has to be scalable.
f) It has to be easy to set up.
g) It can't bloat our code-base too much.
h) It has to be hard for a newbie Composr developer to forget to implement the functionality.
i) It cannot place unreasonable limitations on hardware architecture.
j) It has to have a wide compatibility with actual services people use.
k) It has to have a wide compatibility with actual web hosting people use.

We can therefore eliminate:
1 - This violates 'e' because it is a single bottleneck, and also 'i' because servers would need to be on the same cluster with a very high-performance I/O channel
2 - This violates 'h', as developers can easily forget to call sync_file (they can't if they're running ocProducts PHP, but they're probably not); it also violates 'f'
3 - This violates 'a', 'f', 'h', 'j' and 'k'
4 - This violates 'a' and 'h'
5 - This probably wouldn't work at all, as rsync would not know the difference between a delete and a new file appearing on one particular server
6 - This violates 'a', 'i' and 'k' -- putting potentially gigabytes of data into the database is not something we can reasonably expect the majority of users to accept
7 - This works, although it will be a lot of work.
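
The elimination above can be tabulated as a quick check. This is only an illustrative sketch in Python (Composr itself is PHP); the names VIOLATIONS and surviving_approaches are made up for this example, and the data is transcribed directly from the list above. Approach 5 names no lettered goal, so it is marked as unworkable rather than assigned one.

```python
# Goals violated by each approach, transcribed from the elimination list.
VIOLATIONS = {
    1: {"e", "i"},
    2: {"h", "f"},
    3: {"a", "f", "h", "j", "k"},
    4: {"a", "h"},
    5: None,  # no lettered goal given: "probably wouldn't work at all"
    6: {"a", "i", "k"},
    7: set(),  # works, but is a lot of work
}


def surviving_approaches(violations):
    """Return the approaches that are workable and violate no design goal."""
    return [
        number
        for number, goals in sorted(violations.items())
        if goals is not None and not goals
    ]


print(surviving_approaches(VIOLATIONS))  # [7]
```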

I think we should remove the concept of 'sync_file'. Nobody ever used it.

Then I think we need to implement '7', combined with '4'. That is, we extend our current CDN transfer system so that CDN transfer hooks can accept control of any path/file-type combinations, with a native PHP file access API built on PHP's file wrapper functionality. CDN transfer hooks would sit behind our file wrapper. URLs would be converted via a pair of conversion functions, one for each direction (file path to URL, and URL to file path).
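
A rough sketch of how the hook dispatch and the two-way URL conversion might fit together is below. It is written in Python purely for illustration (the real implementation would be PHP, sitting behind a file wrapper), and every name in it (CdnHook, find_hook, path_to_url, url_to_path, the example.com URLs) is hypothetical, not an existing Composr API. It assumes each hook claims a path prefix plus a set of file extensions, and that anything unclaimed falls back to the local site, which is one way goal 'b' could be covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CdnHook:
    """Hypothetical CDN transfer hook claiming a path prefix and file types."""
    name: str
    path_prefix: str        # e.g. "uploads/"
    extensions: frozenset   # e.g. {"png", "jpg"}; empty means any file type
    base_url: str           # URL that the claimed prefix maps onto

    def handles(self, path: str) -> bool:
        ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
        type_ok = not self.extensions or ext in self.extensions
        return path.startswith(self.path_prefix) and type_ok

    def path_to_url(self, path: str) -> str:
        return self.base_url + path[len(self.path_prefix):]

    def url_to_path(self, url: str) -> Optional[str]:
        if url.startswith(self.base_url):
            return self.path_prefix + url[len(self.base_url):]
        return None


def find_hook(hooks, path):
    """Return the first hook that accepts control of this path, if any."""
    for hook in hooks:
        if hook.handles(path):
            return hook
    return None


def path_to_url(hooks, path, local_base_url):
    """Convert a file path to a URL, via a hook or the local fallback."""
    hook = find_hook(hooks, path)
    return hook.path_to_url(path) if hook else local_base_url + path


def url_to_path(hooks, url, local_base_url):
    """Convert a URL back to a file path; None if nothing recognises it."""
    for hook in hooks:
        path = hook.url_to_path(url)
        if path is not None and hook.handles(path):
            return path
    if url.startswith(local_base_url):
        return url[len(local_base_url):]
    return None


# Assumed example setup: one image-only hook, everything else served locally.
LOCAL = "https://site.example.com/"
HOOKS = [CdnHook("images", "uploads/", frozenset({"png", "jpg"}),
                 "https://cdn.example.com/")]

print(path_to_url(HOOKS, "uploads/galleries/a.png", LOCAL))
print(path_to_url(HOOKS, "uploads/attachments/doc.pdf", LOCAL))
print(url_to_path(HOOKS, "https://cdn.example.com/galleries/a.png", LOCAL))
```

In this sketch the image goes to the CDN URL while the PDF stays on the local site, and the CDN URL converts back to the original path. Deletes (goal 'c') and the actual transfer calls are left out; they would presumably be further methods on the hook, invoked by the file wrapper.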
Steps to reproduce

Funded? No