View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
355 | Composr | core | public | 2011-12-13 00:37 | 2024-08-04 18:57 |
Reporter | Chris Graham | Assigned To | Chris Graham | ||
Priority | normal | Severity | feature | ||
Status | closed | Resolution | fixed | ||
Summary | 355: MVC rewrite | ||||
Description | Much was done in the old 'v8' branch, that will need reimplementing again. Changes under 'Additional information'. | ||||
Additional Information | The 'sources' directory structure has changed: basically split into 'controllers', 'content_models' and 'sys_models'. Something similar to 'dependency injection' is now achieved via a new vtable-based object dispatch architecture. require_code calls are no longer required - it is built into the object dispatch architecture. Direct object 'new' instantiation is no longer allowed; instead you have to use the object dispatch architecture. 'sources' files are now simply 'classes'. They are instantiated like singletons, hooked into a singleton vtable for re-use, but without the disadvantage of singletons because the call tree allows them to be swapped out as required. Most global variables have been eliminated, replaced with static variables or object properties. Exceptions are now used instead of triggering exits mid-code. There is a much stronger MVC architecture enforced. Now only 'model' objects are allowed to interface with the data store, so that we know we can always plug&play model implementations and guarantee writes are passed through an interface that we can track for revision control. The override mechanism is now object orientated, rather than working via code rewriting. The object dispatch architecture recognises a naming convention for override classes. The code-rewriting override mechanism function naming convention has been changed. | ||||
Tags | Risk: Major rearchitecting | ||||
Attached Files | |||||
Time estimation (hours) | 40 | ||||
Sponsorship open | |||||
|
Attached is the bulk of the work done in the old 'v8' branch, including scripts to recreate many of the changes. A lot should be achievable by running those scripts, and then doing diffing and analysis to see what other kinds of changes are needed (particularly, how the classes are designed). It should take much less time to reproduce the work than it originally took to do. |
|
We might kill this idea off, in favour of focusing on extreme performance. Unlike Composr's competitors, Composr is developed by a small team. We don't need to focus on clearly defined component contracts so much, and performance is such an issue nowadays. Moore's law is broken (as it is commonly understood, in terms of clock speed): individual threads aren't really getting faster, yet webapps are increasingly expected to replace desktop apps. Increasing object-orientation is bound to hurt performance (function calls and objects are expensive in PHP), when we should be improving it. That is a highly controversial thing to say, but it is the same view that the original PHP creator (Rasmus Lerdorf) holds: quick and simple. I don't believe quick and simple has to be messy, and I do believe that complex object orientated hierarchies can be overly verbose, ham-strung, and slow - five times the amount of code to do the same thing just gives you clarity of definition at the cost of conciseness, performance, and development speed. We care a huge amount about code standards - we wrote our own code lint to make sure we stick to them (probably no other PHP project has done that) - but some developers get far too anal about OOP and are blind to the wider engineering concerns we all face. So rather than doing this rewrite, I'd rather we approached the problem differently. I'd rather 'sanctify' certain global data, and categorise it better. Perhaps we will group sets of global variables into singleton objects, with a concept of push/pop state on them. Particularly I am thinking of a RunContext (what page is being executed, what parameters it has, what is the active script, etc) and an OutputContext (what CSS files are needed, etc). |
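As a rough illustration of the push/pop idea in the note above, here is a minimal sketch. It is written in Python purely to keep it short and self-contained (Composr itself is PHP), and every name in it (`ContextStack`, `RUN_CONTEXT`, `preview`, the `RunContext` fields) is hypothetical, not an actual Composr API.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class RunContext:
    """Hypothetical snapshot of 'what is being executed' for one request."""
    page: str
    script: str
    params: dict = field(default_factory=dict)


class ContextStack:
    """Singleton-style holder whose current state can be pushed and popped."""

    def __init__(self, initial):
        self._stack = [initial]

    @property
    def current(self):
        return self._stack[-1]

    def push(self, **changes):
        # Derive the new state from the current one, overriding only what changed.
        self._stack.append(replace(self.current, **changes))

    def pop(self):
        if len(self._stack) == 1:
            raise RuntimeError("cannot pop the base context")
        return self._stack.pop()


# One shared instance per request, playing the role of a 'sanctified' global.
RUN_CONTEXT = ContextStack(RunContext(page="home", script="index"))


def render_page_name():
    # Ordinary code just reads the current context; nothing is passed in.
    return RUN_CONTEXT.current.page


def preview(page):
    # Only the code that needs a different context pays for the switch.
    RUN_CONTEXT.push(page=page)
    try:
        return render_page_name()
    finally:
        RUN_CONTEXT.pop()
```

In this sketch `preview("news")` temporarily switches the page and restores the previous state afterwards, so nested (recursive) switches unwind in the right order. An OutputContext could presumably follow the same pattern.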
|
A lot of work has been done in v10 improving the architecture, but it has gone in a drastically different direction to the one discussed here and partly implemented before. The changes proposed here are similar to what other projects have been doing, but really it's not a good idea. This is a very controversial but very defensible position (which will be documented in the Code Book). Where we're going is more of an evolution of our current architecture, for reasons of performance and simplicity. The main issue here comes down to two points... 1) Strict MVC and stronger object classes: Composr always had MVC, but we don't explicitly reference that terminology. Rather than clumsily making 'fat' model objects for each data type (i.e. the very object orientated approach to MVC), the released v8 had what you could call a 'reflection' API. It tied different content APIs together and provided a management API that allowed automatic reasoning and abstraction to occur. 2) Avoiding globals and introducing dependency injection: PHP was created to not be anything like Java, and many PHP programmers nowadays who learnt Java in university don't know any better. There's also a certain degree of Java envy and loathing of the way the PHP language was designed. But really, PHP is great because it's designed so that requests can be incredibly simple, high-performing, and firewalled from each other. By that logic, we want to make our requests directly serve the user based on received web parameters, not try to make some super abstract server that always worries about loading up multiple execution instances in a single request with different contexts. So we optimise for the norm, slipstreaming the obvious and common code paths with minimal architectural verbosity. In v10 we will now supplement that with context-switch APIs that work on a stack - you can push and pop global execution-context and output-state.
Stacks work great with recursion, and this allows us to achieve everything dependency injection can, but with far less overhead and complexity - and any complexity is positioned only where the requests dictate a need for it, not for consideration every time we need to do something simple. Code annotations have been added to show what functions and variables affect output state. Any global variable used by more than 5 .php files has been annotated with a description of its behaviour. Most of the global variables have been renamed with a new naming convention, most ending '_CACHE' to signify they are only there for run-time caching. Most of the globals used by many different parts of the code are no longer accessed directly - new API functions have been added to connect to them, which is a lot neater. A unit test is added to complain if globals get out of control. As mentioned, I'm going to defend this position in the Code Book, because many programmers won't agree (IMO many of the loudest developers are good mathematicians but very poor engineers). I want Composr to be simple and performant, and unique, not a monstrosity of plumbing trying to connect things together by way too much complex and verbose syntax. |
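The "complain if globals get out of control" check described above could look something like the following. This is only a sketch in Python (the real test would be PHP and is not shown in this issue); the function names, the `annotated` set, and the regex-based detection of `global $NAME;` declarations are all assumptions made for illustration. The threshold of 5 files is taken from the note.

```python
import re
from collections import defaultdict

# Simplification: match PHP 'global $A, $B;' declarations with regexes.
# This would also match inside comments or strings and ignores $GLOBALS[...].
GLOBAL_DECL = re.compile(r"\bglobal\s+([^;]+);")
VAR_NAME = re.compile(r"\$(\w+)")


def files_per_global(sources):
    """Map each global name to the set of files that declare 'global $NAME'."""
    usage = defaultdict(set)
    for filename, code in sources.items():
        for decl in GLOBAL_DECL.findall(code):
            for name in VAR_NAME.findall(decl):
                usage[name].add(filename)
    return usage


def unannotated_widely_used(sources, annotated, max_files=5):
    """Return globals used by more than max_files files that lack an annotation."""
    usage = files_per_global(sources)
    return sorted(
        name for name, files in usage.items()
        if len(files) > max_files and name not in annotated
    )
```

Here `sources` maps file names to file contents and `annotated` is the set of global names that already carry a description; a unit test would then simply assert that `unannotated_widely_used(...)` returns an empty list.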
|
Here's my long rant lol, to defend against zealots who might want us to do it the wrong way ;-)...
[title="2"]Why not more object-orientation?[/title]
Most PHP CMSs have gone in the direction of having very sophisticated object systems, with dependency injection, design patterns and namespaces. Composr however doesn't use this kind of stuff, and breaks what some people would consider best practices. [i]This is entirely intentional[/i] and we want to explain why, because there are a lot of people out there with strong opinions on how to design a web system and we realise our approach is not the conventional one that a lot of people are pushing. We want to explain why we think we have built a far more productive environment in Composr. This will seem like a bit of a wild rant, but really we've gone into detail to hopefully shine a light on the Composr design decisions. It's a matter of good engineering to consider all factors when coming up with a design. Here are some design considerations, in rough order of decreasing importance:
1) Ease and efficiency of coding for the framework
2) Performance of live code
3) Ease of understanding and maintaining the framework
4) Ability to automatically test
5) Strict separation of concerns to stop developers interfacing incorrectly with functionality
6) Ability to create complex abstractions and contextual shifts when dealing with special scenarios -- for example, test runs and simulations
'1' is so important. We feel that with OOP-heavy frameworks, where you have to worry about relatively complex initialisations of every part of the system you want to touch, and how you are going to plumb all the different components together, it really makes the code take a lot longer to write due to the extra verbosity. In fact, 66% of code can easily be taken up just by extra plumbing, and this really hurts when you're trying to get ROI on your time investment.
Yes, extra plumbing is not that hard, but it is time consuming to do and maintain and to read through.
Regarding '2', method calls are expensive in PHP, and variable storage uses a lot more memory than you think, as does the memory simply to load up PHP code. So, there really is a high cost in all the plumbing you might want to do -- 66% more code could certainly mean 40% more memory usage, and 30% slower CPU performance. Users increasingly expect sites to respond extremely fast, yet CPU speeds have not really increased in a decade, and PHP is not getting any faster either. On top of all this, frameworks also mean you need to load a lot more code than you need, due to having to load up structures that are more complete and complex than you strictly need to serve the simple requests you are normally serving. Most OOP-heavy framework users respond by slapping on a full-page cache, but this is very limiting because then you really block the social interactivity you really want from a system like Composr.
Regarding '3', OOP-heavy architectures try to abstract away details to make things simpler, but in the process of defining custom interfaces for all modes of interaction, and plumbing to connect all the objects together, you end up with a huge amount of new interface complexity to both understand and maintain. We think this really is a zero-sum game. There's a really important design principle in the field of agile programming called 'You aren't gonna need it' (YAGNI), and OOP-heavy architectures often go against this by trying to plan out for all access scenarios upfront and abstract them perfectly -- it's much better that we don't waste resources (both programmer time, and system resources) until we need to make a particular access pattern work, and then we can optimise it for how we need it. Apart from cost/schedule, YAGNI is also important because you can't predict future requirements well -- trying to anticipate too much by putting in too much structure between components actually works against you rather than for you, locking you into patterns you actually would not have wanted. In other words, too much structure costs a lot to make and can tie you down as much as it can guide you.
'4' is definitely something OOP-heavy coding helps with. However, you can achieve equivalent things in simpler designs also.
'5' is another win for OOP-heavy systems, but it does kind of assume you are employing code monkeys, and that is never going to work however much you try and mollycoddle them.
'6' is something OOP-heavy systems do very well, but read on.
As you can see, simpler designs solve the most important concerns best, and the remaining concerns are well served by OOP-heavy designs. However, our Composr design is not actually a naive simple design at all -- it's actually very carefully structured, and has a high degree of strictness/discipline in many areas. Composr is definitely not anti-OOP by any means, it is only anti-over-engineering -- we actually rely on the strengths of OOP in many key areas of the system. Here's how our approach works to achieve the commonly-cited OOP-heavy advantages (flexibility, maintainability, clarity, and stability) without slowing development, complicating code, or hurting performance:
- We can achieve the equivalent of dependency injection as our output and execution context states support a stack-based switching mechanism, allowing us to create sandboxes very effectively for those few cases where it is useful. The critical difference is we achieve it in far less code, with better performance, in a way that is easier to understand, and where the burden for it is only felt when you actually need this advanced functionality, rather than every time you use any component of the system.
- We can achieve inheritance via our code overrides system, which also provides the ability for users to override components without having to get down and dirty with coding. Different versions of code can be placed and tested as overrides, and new functionality can be placed using overrides. This does not suffer from the same problems as copy and paste coding, as overrides are designed to work on the function level, via some clever code we have in place, and we can even make line-by-line changes should we need to, or extend individual functions.
- We do have unit tests to test critical functionality. Admittedly these are more prone to break if we start redesigning stuff without confirming the interfaces used by the unit tests are still appropriate, but the difference is really not that much.
- We keep a very tight rein on global variables that we define. There is a formal naming scheme, and we have an automated test that scans the codebase and verifies that we don't start accessing any of them all over the place. Generally a global belongs in the file it is defined in, and from this point of view the source code files are like classes -- the globals in them are their private properties. In essence, rather than enforcing this through verbose code plumbing and memory consumption, we automatically enforce it through automated testing for compliance to our coding standards. If a global is allowed to be accessed from multiple places (usually for performance), then we have a rule that it has to be given a formal annotation describing what it does.
- We have the ability (via APIs and simple glue objects) to 'reflect' on content types within Composr in a really abstract way, even though the content types aren't defined by "fat objects". This provides the same power that rigid conformability allows, but without hurting performance or actually requiring rigid conformability.
The Composr design is actually based on how PHP is supposed to work, rather than Java-envy.
Critics often think of PHP as inelegant, and it is true there are a lot of inconsistencies in the language and that some old legacy PHP features (that we don't use) were very poorly designed -- but PHP is also widely popular because it is cost effective and easy to use, and this has been proven with top sites like Facebook using it. PHP is built on the concept that web requests should be simple and fast -- they should get straight to the point and get what is needed done, and this is why PHP runs each request in its own 'sandbox' rather than as a thread on an application server. Web requests start up, do what they need to do, then are killed off. We therefore design Composr around the idea that these web requests are simple and fast, rather than trying to build a castle in the sky for a process that will terminate anyway as soon as it's finished sending a result back. PHP environment access is based around 'super globals', just as Composr's access to key interfaces (like the database connections) is. However, we improve on super globals by adding the stack-based context switching functionality described earlier.
So... hopefully you can now see past the anal arguments some people make that everything should be designed a certain way -- an expensive, complex way that performs really poorly in the name of a theoretical but unrealisable maintainability. Don't believe any methodology's hype without doing a good analysis first -- maintainability really comes down to the effort the team puts into keeping a codebase tidy and consistent, not much else. As good engineers we also know we want a framework that provides the most efficient possible coding method, in addition to the robustness, flexibility, and maintainability that we need.
Knuckling down and over-engineering everything is going to make coding progress very slow/expensive and probably going to make code less maintainable rather than more maintainable -- "simple by default" is a better plan if implemented with care and maintained with agility. Rant over ;). |
Date Modified | Username | Field | Change |
---|---|---|---|
2016-06-08 00:14 | Chris Graham | Tag Renamed | Major rearchitecting => Risk: Major rearchitecting |
2024-08-04 18:57 | Chris Graham | Relationship added | related to 5838 |