The Data Consultant's Lament

Enterprise data projects can take you on strange journeys. Good management, a sound data strategy, and yes, a cute cat picture can make sure you reach your destination, no matter how far out. GETTY


I wrapped up a project today, one that had been cut short because the company in question had a change in management and a corresponding shift in spending priorities. While I'm quite proud of what I produced under the circumstances, I fully expect that the project will never see real use, because there has not been enough time for testing, training or maintenance planning.

This is extraordinarily common in business - I've seen far too many projects end their lives this way, not because they weren't worthwhile or even cost-effective, but because they were seen as easy line-item deletions when a new management team came in or last quarter's numbers were bad.

Despite a whole industry of marketers who may claim otherwise, there is a hard reality: software takes time to develop, test and deploy. Some of this has to do with the complexity of implementation, which is a function of the software developers, and a whole cottage industry of agile scrum masters and similar pundits has emerged to make this phase as efficient as possible.

Yet the vast majority of project failures occur not because of bad programming practices but because of poor management practices. These show up over and over again, but perhaps because management has a stake in remaining management, all too often these failures are written off as "mistakes were made," swept under the rug along with the contract development teams unlucky enough to be counted among the mistakes.

There are several kinds of "mistakes" that are worth noting:


More data projects die of bad management decisions than of bad programming. Understanding your needs, your data strategies and your enterprise governance can go a long way toward ensuring your next project will be a success. GETTY


Do You Really Need That Data Project?

The IT marketing machine is very much a factor in management decisions. We all eagerly wait to see where we land in a given Magic Quadrant, or whether a given technology is on the upslope of the hype cycle or wallowing in the trough of disillusionment. The latest electronic press touts the benefits of big data or mobile tech or artificial intelligence, each claim about how the world is going to be transformed more magical than the last.

The reality is somewhat less exciting. Do you really need to hire a stable of data scientists and undergo a deep data mining operation? Is your business going to significantly improve if you analyze every error log in every database you have, and will the future be any more revealed to you if you look at what's currently trending on Twitter? Are chatbots really the future of user interfaces?

Businesses should evaluate software based on four criteria:

  • Will it help you produce the things that you sell more productively than you can do now?

  • Will it give you more clarity about future market behavior beyond what you can achieve now?

  • Will it make it easier to promote your product to that market than you can do now?

  • Will it provide a better view of the exact state of your business than you can do now?

If the software does not accomplish any of these, it is likely not worth investing in.
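As a purely illustrative sketch (the criterion names and the any-one-criterion threshold are my own framing, not a formal method), the checklist above can be reduced to a simple screening function:

```python
# Hypothetical screening sketch for a proposed software investment.
# The criterion labels below paraphrase the four questions in the text.
CRITERIA = [
    "produces what you sell more productively",
    "clarifies future market behavior",
    "eases promotion of your product to its market",
    "gives a better view of the state of the business",
]

def worth_investigating(answers: dict) -> bool:
    """A project merits a closer look only if it helps on at least one criterion."""
    return any(answers.get(c, False) for c in CRITERIA)

# A project that moves none of the four needles fails the screen:
print(worth_investigating({c: False for c in CRITERIA}))  # False
```

The point is not the code but the gate: if every answer is "no," the investment question is already settled before any vendor demo.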

However, even these are broad buckets that can be further refined. Future prediction, for example, may sound like a winner, but all too often, once enough people in the same general market are peering into the future based upon current trends, you end up with chaotic effects that can invalidate everything (cf. Bitcoin, currently undergoing extreme volatility). Futurists talk about separating the signal from the noise, but the noise is itself the signal of less likely futures. Acquiring more information often just amplifies the noise and increases the costs involved.

The same holds true with promoting your product. SEO is just the latest manifestation of this, in effect trying to game the system so that the indexes favor your company or product. Spending more money on better SEO solutions becomes an arms race with the search engine producers, who are attempting as much as possible to ensure that any bias in their rankings favors them. Spending more to improve a product, or to make your advertisements valuable enough to customers to justify their attention, will likely generate better returns - but these approaches require human intervention, which all too many companies are loath to turn to because such solutions do not always scale well.

The same kinds of arguments apply to the other two questions. In general, the ability of computers to act as augmentation tools has been well tested for decades, and it is arguably here that the information age has shown its greatest benefits. However, the final point needs to be looked at very carefully, as it is where the justifications for data projects often begin.


Your enterprise data dashboards may look cool, but unless your data strategies are sound, they can be both expensive and misleading. GETTY



One of the most commonly requested management-oriented applications is the dashboard. One of my brothers is a senior sergeant in the army, where he oversees helicopter maintenance after a long career as a helicopter mechanic. Not surprisingly, he also works on cars as a hobby. He laments that even good cars have been slowly losing the dials that indicate things like RPMs and oil pressure in favor of warning lights that convey far less information. He calls these new indicators idiot lights, and it's a term I've adopted for discussing data systems.

In recent years, not surprisingly in conjunction with the rise of "Big Data," a class of software called dashboards has become very much in demand in senior management circles. The idea of a dashboard is very much the same as the idiot lights on a car: it indicates when something is wrong without doing more than giving a very vague idea of what that something is. Dashboards and their lineal descendant, the executive chatbot, in theory allow a very busy executive to determine at a glance whether the company is in trouble, without actually needing to understand the shape of the data.

In practice, they do nothing of the sort. Most dashboards provide pretty information graphics - charts, graphs and maps, usually - that sit on top of incredibly complex queries that a team of data scientists spent weeks or months refining from existing databases, though the final data could generally fit just as readily in an Excel spreadsheet. The problems with such data are threefold. First, most organizations do not have a solid data governance strategy, so the data that comes from various systems may be inconsistent, poorly interconnected and of questionable provenance.

Additionally, if the dashboards do not run precisely the right queries against the data itself, what the executive sees is quite possibly meaningless, and possibly highly misleading. Finally, breaking information down into red/yellow/green indicators (or similar widgets) reduces potentially complex information into overly simplistic forms. An executive may get an indication that something is wrong but have no idea what it is or how severe the problem really is.
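To see how much information a traffic-light rollup discards, consider a minimal sketch (the metric and the thresholds here are invented for illustration): very different underlying situations collapse into the same indicator.

```python
def status(error_rate: float) -> str:
    """Collapse a continuous error rate into a traffic-light indicator.
    Thresholds are arbitrary, chosen purely for illustration."""
    if error_rate < 0.01:
        return "green"
    if error_rate < 0.05:
        return "yellow"
    return "red"

# A barely-over-threshold blip and a full-scale outage look identical:
print(status(0.051))  # red
print(status(0.90))   # red
```

The executive sees the same red light in both cases; the severity, trend and cause have all been thrown away before the data reaches the dashboard.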

Such dashboards also suck a lot of resources (coders, designers, analysts, systems, money) away from other projects that may return more value for the buck.

This is not to say that a broad enterprise data governance strategy is even remotely a bad idea. However, the goal is to determine what enterprise data is in fact enterprise in scope, build out a comprehensive strategy to make that data as clean, unambiguous and reliable as possible, and only then build extensions (such as dashboards, enterprise chatbots and modular add-ons for platforms) to get decision-makers the data they need.


Champions are important to the success of a project. Just make sure that they're not about to leap elsewhere.   GETTY



Management Changes

Most projects can only be accomplished within an organization if there is a champion to keep that project alive. If that champion gets promoted, his or her clout may grow, but the mandate that person has over the original project shrinks. If the champion leaves (or worse, is fired), then the new manager will typically clean house, eliminating projects that his or her predecessor had underway - often simply because they are not the new manager's projects. Taking time to assess every project is often not realistic, especially since the new manager knows that they need to make their mark as quickly as possible or face the same fate.

One way around this is to commit to six-month delivery cycles, then deliver on them. I'd actually argue that part of the problem here is the promise (most frequently unfulfilled) of Agile methodologies. Software has not really become all that much faster to deliver than it was in the bad old days of waterfall, but agile gives the illusion that at any stage in the process you have a potentially viable product. In reality, most software projects do not reach a stage of useful viability until about six months in. There are exceptions, of course, but in most cases those involve augmenting already mature, stable products. Building new products from scratch in two to three months that have been thoroughly tested, have complete functionality and have adequate documentation? Not even remotely going to happen.

Put in that light, the best time to kill a project is within the first few weeks. The worst is any point after about three months, because the big costs (infrastructure, design and so forth) have already been sunk. Moreover, the risk of breach-of-contract litigation on the part of contractors rises in proportion to the overall time on the project, and even if a court finds in your favor, the costs and distractions of fighting that litigation are still costs that you will likely eat.

As a useful metric, if a manager has projects in their portfolio, wait until those projects complete before promoting them to a position of broader responsibility. Seeing a project through to its completion is an integral part of management, as is being able to allocate people to take over maintenance of such projects once the projects are completed and find new champions to take on new projects.

The more abstract your data, the longer it takes to visualize it. Plan accordingly.




The Invisibility Syndrome

Popular media to the contrary - where you see the hacker magically brushing aside dangerous-looking projections of dark ice - most coding involves, at its core, people using a glorified typewriter to perform queries against databases, build abstractions of processes and send packets of information from one server to another. It is, in a word, dull. You don't see very much happening on the screen because the screen is yet another part of the program, usually one that can only be started once those other pieces are out of the way.

This means that it is difficult to see physical progress on software until comparatively late in the development cycle. Far too many managers equate the lack of visible progress with the lack of actual progress, and, as costs begin to rise, there's a very real temptation to pull the plug because the developers don't seem to be doing anything.

Now it is possible to build progress demos, and most coders do, as a means of testing their code. However, those demos take time and energy away from the actual coding and, in fact, can give a false impression of how complete the project really is: in putting together a flashy demo, you are subverting productive coding with unproductive coding, delaying the project even further.

As a rule of thumb, most user interfaces really begin to take shape about four months into production. This is not because user interfaces are that hard to write, but rather because the interface is just that - a shell for different services and daemons to interact with to give the user access into the program. If those services and daemons don't exist, you usually can't make the shell work properly. Since most people judge a program on that user interface, that means that a manager has to take a leap of faith for about four months before that faith can be rewarded.

Once you commit to doing a project, assume you will be adding a division to your company for the next six months. You get no savings if you skimp when building and then fail to complete the project.




Don't Under-Budget

Most commercial, off-the-shelf software can seem fairly expensive, especially when such software is a data server of some sort. In reality, however, that software represents investments on the order of tens of millions of dollars, employing dozens of coders and perhaps hundreds of support people, mostly guaranteeing things like good documentation, robust testing, solid designers and the ability to deploy the software on often wildly different platforms.

Open source software is free to acquire. However, it lacks many of the things that differentiate most COTS software, including indemnity protection. Backwards compatibility is iffy at best, and the likelihood that you will need to hire more expensive programmers rises dramatically, because the interfaces are generally oriented towards knowledgeable specialists.

Bespoke, custom software is the most expensive option of all. As a consultant, it's not in my interest to tell you this, but most consultants are well aware of the fact. Bespoke, specialized software frequently involves unknowns - how do you solve a particular problem, how do you scale efficiently, how do you ensure that only the right people can use it, and so forth. It requires that design work - developing good user experiences - be carried out, and software design is still more art than science.

If you're going to undertake custom software, assume that you will need to create the equivalent of a new department for the duration of the project. It takes time to put together a team, and unless that team is from a smaller boutique consultancy, it's very likely that the people in the team have never worked together before, meaning that a significant portion of the time in putting together software is in establishing protocols for communication, getting people on-boarded and up to speed on the project, and getting them access into the systems that they need.

A company could save a man-month's worth of money per contractor by making sure that they have all the access that they need the moment they walk in the door, yet I cannot think of more than a couple of companies in my decades of experience that actually recognized this fact. Not surprisingly, they also had the highest project completion rates.


Once an optimal size is reached, adding more people to a project is counterproductive - it can make things worse. GETTY



Pouring People Into Projects

Many managers see software development as if it obeyed a linear rule: the more people you put into a project, the faster it will get done. In reality, a better analogy is the oil in your car. Too little oil and your bearings start to scrape together, rotors scrape against surfaces, pistons wear out faster. As you add oil, these things no longer take place, and your car begins to run more smoothly.

However, put too much oil in your car and oil pressure climbs, leaks begin to form and eventually cascade, the engine starts smoking, and you end up with oil being burnt and adding grime into the system, which causes everything to start falling apart. Soon your car is generating big plumes of smoke, and you get pulled over by the cops. I speak from experience here.

To a certain extent, adding more people if you are understaffed means that people have fewer things to do simultaneously (which means less multitasking, which doesn't really work anyway) and can concentrate on one thing at a time. Certain areas, like architecture and design, are usually best handled by one or maybe two people, because they are establishing a vision. A team of architects is an oxymoron, and is almost always a signal that you're overstaffed.

Good management of contractors is always about ensuring you have enough hands without compromising on communication costs. When you have too many people, the number of hops a message takes to reach its intended recipient increases dramatically, the likelihood of that message being corrupted gets higher, and the overall time devoted to dealing with communication increases in a non-linear fashion. This can be exacerbated in a scrum environment, because the rationale for implementing a particular component can get lost across the added layers of communication, often leaving a developer without critical information when they most need it.
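That non-linear growth has a simple combinatorial core: with n people there are n(n-1)/2 potential pairwise communication channels, so doubling a team far more than doubles the paths a message can take. A quick sketch makes this concrete:

```python
def channels(n: int) -> int:
    """Number of potential pairwise communication channels in a team of n people."""
    return n * (n - 1) // 2

# Doubling headcount more than quadruples the channels to manage:
for n in (5, 10, 20, 40):
    print(n, channels(n))  # 5 -> 10, 10 -> 45, 20 -> 190, 40 -> 780
```

This is the same arithmetic behind the old observation that adding people to a late project makes it later: every new hire adds a channel to everyone already there.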

Admittedly, most staffing agencies, while aware of this, generally dismiss it, because the more seats they can fill, the more money they make - regardless of whether the project reaches a successful conclusion. It should be pointed out as well that staffing agencies tend to put more junior people into critical positions, both to give them on-the-job training and because many agencies find it difficult to find and retain senior people. The idea that two or three junior developers are better than one seasoned professional seems to have become well established in Silicon Valley and elsewhere, but ideally you want a spectrum of seniority in your consulting organization, just as you would in your own.

However, in general, avoid the temptation to staff up mid-project. If you're hiring people in month four of a six-month project, they will really only give you about two weeks of usable time. I've found that a good time to rethink your staffing is again at the six-month point, as your requirements will generally shift into a different mix: maintainers on the existing project, and architects and developers looking to expand what's been built into different areas while the initial product undergoes real-world testing.


As one project wraps up, the architects, designers and stakeholders should be looking at how to keep the momentum going into new projects, based upon what's been created. 



Plan for the Future

Software is never really done. There are always functions that were planned but were never implemented, bugs that will only surface once there's been enough real-world testing (e.g., everyday use), and changes that need to be made when underlying libraries or modules break. Additionally, good software needs to have some form of documentation written, a task that is frequently better done by a trained user rather than a coder.

Maintenance is typically handled by in-house coders, but extensions (and refactors) are frequently managed as separate projects and are as often as not performed by contractors. The last couple of months of a project is a good time to bring in the architects and designers to start sketching out what such extensions would look like, as their primary role - designing architectures, UX or data models - is usually drawing to a close by then.

The key here is to take advantage of momentum. Design is an integral part of software development, but it can easily eat up a couple of months before a coder writes "Hello, World!" Shifting the design process into the tail end of the previous development cycle ensures there are plans in place and coders can begin coding with little interruption. Allowing for a month of lag time also makes it possible to fine-tune plans, change staffing levels and give everyone a chance for closure. From a consulting standpoint, it also gives consultants a chance to decide whether they want to tackle the new project or move on, without restarting the whole process from the beginning.

Enterprise data management is not application data management. Understanding the scope of both data and metadata is key to managing enterprise projects. 

Conclusion

Enterprise data is not application data, although application data should use enterprise data whenever possible. This means that enterprise data projects should identify data scopes and apply strong governance principles at all levels of the organization, while recognizing that such data (and the requirements that drive its acquisition) will evolve over time.

To the extent possible, get your data team involved early in any project, and ensure that non-technical managers gain at least some training in data governance, the forms and dynamics of both data and metadata, and how what they do impacts, and is impacted by, the company data strategy. Assess your needs, spend some money up front to prototype a few different approaches before committing at large scale, and where possible scope your data requirements to the division the work is intended for. Finally, understand that the more you focus on enterprise-centric data, the more abstract the process, and the longer it will be before you see any meaningful result.


Kurt Cagle is Managing Editor for Cognitive World, and is a contributing writer for Forbes, focusing on future technologies, science, enterprise data management, and technology ethics. He also runs his own consulting company, Semantical LLC, specializing in Smart Data, and is the author of more than twenty books on web technologies, search and data. He lives in Issaquah, WA with his wife, Cognitive World Editor Anne Cagle, daughters and cat (Bright Eyes).