CTO of MyBuilder.com - The better way to find a builder. Built on PHP, Symfony, PostgreSQL

Unit Testing Code Boundaries


When I first learned to unit test my software, I noticed that I struggled to test code that interacted with components at the boundaries of my design. These components would often be input/output related, whether that was getting input from the command line or firing off a request to an external HTTP API. Due to these external dependencies, I couldn't work out how to write a test that didn't need to use the service in some way or another. Having read that I should keep my tests decoupled from any external dependencies, I regularly chose not to test these aspects of the system due to it being "too hard" to do.

Hopefully you will agree that my initial approach was a bad decision. Such components serve an important role in the systems we develop—they are the interfaces through which we send and receive the data our application depends upon, which implies that they are also depended upon by our customers, and, ipso facto, our businesses.

By not writing unit tests to cover my software's interaction with boundary code, I was introducing a great deal of risk into my applications, reducing the flexibility and extendability of my software, and potentially increasing the cost of future changes.

After battling with this problem for a while, I started to learn about mocks. Ah ha, I hear you say. At last! Unfortunately, this sad little story doesn't quite end there. Mocking is certainly a good solution, but as I investigated the various possible mocking frameworks, I still found that I was struggling. Most of the examples I found relied on mocking the method calls and responses of library software designed to interact with an external API.

This meant that I needed an intimate knowledge of the internals of the library, which I just didn't have. Further, my code was completely coupled to the library code, and any changes to it rippled throughout my software. It also meant I couldn't easily swap out one external service for another without significant changes to my code. That smells like a violation of several SOLID principles to me. In short, I still hadn't resolved the business-impacting problems I wrote about above.

I think that my problems stemmed, in large part, from where the wealth of literature on unit testing and test-driven development puts its focus: it explains the concepts at length but rarely the practice. I agree with my colleague, Thomas Countz, who opined in his recent article on Essential and Relevant Unit Tests that much has been written on why to do unit testing but little has been written on how to do unit testing well.

A few weeks ago I was ramping up on a new project that is primarily written in Python. Coming from a Java background, I'm familiar with the main unit testing and mocking frameworks, as well as writing manual mocks. My experience with Python is more limited and I wanted to know what mocking frameworks were available. As I researched the popular tools, I noticed that their "Quick Start" guides gave examples of directly mocking methods on external APIs. I believe this is bad advice that is likely to trip up new developers, as well as a fair number of seasoned pros. So what's the solution?

In the remainder of this article I'll explain what boundary code is, highlight the benefits of designing the boundaries of a system with testability in mind, cover the key areas in which I think developers can improve their unit tests, and provide a couple of examples along the way. I'll use both Python and Java for the examples, since they are popular languages.

What is boundary code?

When I talk about boundary code, I mean any part of a system that interacts with the outside world. That is, anything that receives input or transmits output. One way of categorising such boundary components comes from Hexagonal Architecture—any component that is a port is considered to be on the boundary.

There are two main types of boundaries, one of which is significantly more obvious than the other:

  1. External systems.
  2. Standard libraries and packages.

Code that interacts with external systems is, I hope, an obvious boundary component. Examples include interacting with database systems, file systems, REST APIs, audio devices, etc.

Standard libraries and packages are perhaps less obviously boundary components. These libraries are often provided by programming language authors to make developers’ lives easier, and are in many instances baked into our language of choice. But let's take a quick moment to examine the functionality that some of them provide: filesystem IO, console IO, HTTP clients, etc. Since these types of libraries exist to deal with receiving input and sending output, I consider them to be boundary components.

What are the benefits of designing boundaries with testability in mind?

As I mentioned above, the main problems that arise when developers are encouraged to couple their code to a boundary are that:

  • Decision making cannot be deferred.
  • Risk is increased.
  • Flexibility is reduced.
  • The cost of changes goes up.

As software developers, we should strive to give our teams as many options as possible for as long as possible, reduce risk, increase flexibility, and minimise costs. Doing so provides them with the best opportunity to respond and adapt to changing market demands. Whilst effectively tested code boundaries are not the prime factor in determining how well software meets these goals, they go a long way. As my mum used to say when getting me to tidy my room, take care of the edges and the rest takes care of itself.

How can unit testing at code boundaries be improved?

As I outlined above, I believe there are two key areas that constitute boundary code: code that interacts with obviously external systems and code that uses standard libraries to interact with the local system. The solution is the same for each of these—by employing the Adapter Pattern we can use the Humble Object test pattern to push the complexity of interacting with code boundaries to the very edges of our system. Below, I'll provide an example for each of the two cases in turn.

Unit testing code boundaries with external systems

When developing unit tests, it can be hard to write tests for code that interacts with external systems. This is because we are writing code that interacts with services that we do not own and which, more often than not, do not have a deterministic output. In terms of functional programming, functions that interact with external systems cause side-effects. Since components at the boundary rely on side effects, they are at odds with the deterministic nature of unit testing. We regularly use third-party libraries and frameworks to help us interact with such systems.

We can avoid directly mocking calls on these external APIs by defining an interface that serves as an adapter to the service. The adapter wraps the code that interacts with the external service with methods that describe the functionality we need from the external service. We then implement this interface with the code that calls the real API and as many mocks as we need for testing purposes. In a dynamic language such as Python, we use duck typing to achieve this.

For example, suppose I am writing an application that needs to download CSV files from an Amazon S3 bucket. Further, suppose that the business has said they will be moving all their infrastructure over to a similar service on the Microsoft Azure platform within the next 12 months. If I don't use a wrapper, my code might look something like the following:

import io
import os

import boto3


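def download_csv(filename):
    # Build an S3 client against the endpoint named in the environment.
    s3_client = boto3\
        .session\
        .Session()\
        .client(service_name='s3',
                endpoint_url=os.environ['S3_URL'])

    with io.BytesIO() as data:
        s3_client.download_fileobj(
            os.environ['BUCKET_NAME'], filename, data
        )
        # Decode inside the with block: the buffer is closed on exit,
        # after which getvalue() raises ValueError.
        return io.StringIO(data.getvalue().decode('UTF-8'))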

By embedding the code that interacts with S3 directly in the program code, I make anything that calls it hard to unit test without a live S3 connection. As such, I'm unlikely to test any behaviour of this part of the program. To overcome this, I can use the Adapter pattern to create a wrapper object:

import io
import os

import boto3


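class S3Repository(object):
    def download_csv(self, filename):
        # Build an S3 client against the endpoint named in the environment.
        s3_client = boto3\
            .session\
            .Session()\
            .client(service_name='s3',
                    endpoint_url=os.environ['S3_URL'])

        with io.BytesIO() as data:
            s3_client.download_fileobj(
                os.environ['BUCKET_NAME'], filename, data
            )
            # Decode inside the with block: the buffer is closed on exit,
            # after which getvalue() raises ValueError.
            return io.StringIO(data.getvalue().decode('UTF-8'))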

Not much has changed—the download_csv() function is now a method on an S3Repository class, which contains the complex and hard to test logic. Any clients of the class no longer rely on the implementation of the download_csv() method—they have become Humble Objects. I can use duck typing to substitute a mock wrapper in place of the real implementation during tests (note that we should extract the logic to load the environment variables into their own adapter too):

import io


def example(filename, repository):
    return repository.download_csv(filename)


class TestRequiringS3Repository(object):
    def test_example(self):
        repository = MockS3Repository()
        csv = example("filename.csv", repository)

        assert csv.read() == "Some,Test,Data\n1,2,3"
        assert repository.download_csv_called_with_filename == "filename.csv"


class MockS3Repository(object):
    def __init__(self):
        self.download_csv_called_with_filename = ""

    def download_csv(self, filename):
        self.download_csv_called_with_filename = filename
        return io.StringIO("Some,Test,Data\n1,2,3")

My unit test is now decoupled from boto3 and thus the dependence on Amazon S3, meaning I can test behaviour without needing to have the necessary settings configured to connect to the correct S3 account. I can use different mocks to exhibit different behaviour, such as when a file is not found, or if there are network issues. When I need to swap Amazon S3 for Microsoft Azure, I only need to implement a new class wrapping the same functionality but for Azure. If the external library changes its API, such changes are isolated to one place in my code.
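As a rough sketch of the "different mocks" idea, the duck-typed interface can be written down explicitly with typing.Protocol and paired with one mock per behaviour. The names CsvRepository, StubRepository, MissingFileRepository and count_rows are hypothetical, and the sketch assumes the real adapter would translate the library's own "not found" error into Python's built-in FileNotFoundError; neither is taken from the code above.

```python
import io
from typing import Optional, Protocol, TextIO


class CsvRepository(Protocol):
    """Hypothetical name for the interface that clients depend on."""

    def download_csv(self, filename: str) -> TextIO: ...


class StubRepository:
    """Mock that always returns the same canned CSV content."""

    def download_csv(self, filename: str) -> TextIO:
        return io.StringIO("Some,Test,Data\n1,2,3")


class MissingFileRepository:
    """Mock that simulates the requested file not existing.

    Assumption: the real adapter raises FileNotFoundError in this case.
    """

    def download_csv(self, filename: str) -> TextIO:
        raise FileNotFoundError(filename)


def count_rows(filename: str, repository: CsvRepository) -> Optional[int]:
    """Hypothetical client: count data rows, or return None if the file is missing."""
    try:
        csv = repository.download_csv(filename)
    except FileNotFoundError:
        return None
    lines = csv.read().splitlines()
    return max(len(lines) - 1, 0)  # the first line is the header


def test_counts_rows_when_file_exists():
    assert count_rows("filename.csv", StubRepository()) == 1


def test_returns_none_when_file_is_missing():
    assert count_rows("missing.csv", MissingFileRepository()) is None
```

Because the client only knows about download_csv(), an Azure-backed class with the same method could be passed in without touching count_rows or its tests.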

You may be wondering how you would test the code that actually communicates with the S3 bucket. Such code should certainly be tested, but it is not the role of a unit test to do so. Instead, exercise this functionality with automated integration tests—these need not, and should not, test every single eventuality, but they should verify that the repository code works as expected in at least one scenario.

Unit testing code boundaries with standard libraries

A similar approach should be taken when testing code that uses standard libraries to perform IO-type actions. The main difference, and perhaps objection, when doing so is that these APIs are generally stable and unlikely to change. However, using a wrapper object makes testing the behaviour at the boundary much simpler.

To demonstrate this I'll use a simple example, which asks a user for their name and tells them how many letters it contains. The main logic is counting the number of characters in the input string, but to test that everything is wired up correctly, I want to simulate input and output. This can be done with the following test:

@Test
void tellsAshleyHeHasSixLettersInHisName() {
  MockConsole console = new MockConsole("Ashley");
  CliApp app = new CliApp(console);

  app.run();

  assertTrue(console.displayedMessageWas("Ashley has 6 letters"));
}

To implement this, I created a simple interface called Console—not to be confused with java.io.Console—that serves as an Adapter for Java’s input and output functionality, making it a Humble Object too. My mock console then simulates the input and asserts that the correct output was requested:

public interface Console {
  String getName();

  void displayMessage(String message);
}

public class MockConsole implements Console {
  private String input;
  private String message;

  public MockConsole(String input) {
    this.input = input;
  }

  @Override
  public String getName() {
    return input;
  }

  @Override
  public void displayMessage(String message) {
    this.message = message;
  }

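  public boolean displayedMessageWas(String expectedMessage) {
    // Compare from the expected side so that a message that was never set
    // yields false rather than a NullPointerException.
    return expectedMessage.equals(message);
  }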
}

The application logic is very simple, as follows:

public class CliApp {
  private Console console;

  public CliApp(Console console) {
    this.console = console;
  }

  public void run() {
    String name = console.getName();
    int numberOfLetters = name.length();
    // Use the name that was read rather than a hard-coded one.
    console.displayMessage(String.format("%s has %d letters", name, numberOfLetters));
  }
}

By declaring Console as a wrapper interface, I am able to remove any dependency on an IO library, resulting in more robust tests, flexibility and options—in this trivial example that would most likely only be reading from System.in and writing to System.out, but I hope you can see how powerful this approach to unit testing code boundaries with standard libraries is. Adapters can prove especially beneficial when working with more complex libraries, such as sockets.
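The production side of such an adapter is not shown above, so here is a minimal sketch of what it might look like, written in Python to match the earlier examples rather than Java. StreamConsole, get_name, display_message and run are hypothetical names; the only assumption is that the adapter is handed a pair of text streams, defaulting to standard input and output.

```python
import io
import sys


class StreamConsole:
    """Hypothetical adapter wrapping a pair of text streams."""

    def __init__(self, source=sys.stdin, sink=sys.stdout):
        self._source = source
        self._sink = sink

    def get_name(self):
        # Read one line and drop the trailing newline.
        return self._source.readline().rstrip("\n")

    def display_message(self, message):
        self._sink.write(message + "\n")


def run(console):
    """Application logic: it only knows about the two console methods."""
    name = console.get_name()
    console.display_message("{} has {} letters".format(name, len(name)))


def test_stream_console_round_trip():
    source = io.StringIO("Ashley\n")
    sink = io.StringIO()
    run(StreamConsole(source, sink))
    assert sink.getvalue() == "Ashley has 6 letters\n"
```

The adapter stays thin enough that there is little in it to get wrong, which is the point of the Humble Object pattern: the logic worth testing lives in run(), and the adapter merely forwards to the streams.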

Wrapping up

Unit testing is an important tool in a developer’s toolbox. However, it is not enough to merely have the tool at our disposal. Rather, we need to become adept at wielding it in each of the situations in which it is needed. One area where I think existing instruction could be improved is how to use unit testing to improve the quality of our code boundaries. By doing so, we benefit by designing software with reduced risk, increased flexibility, and decreased cost of changes. To enable these benefits we can use the Adapter and Humble Object patterns.

splicer
2052 days ago
London

Insight into Site Reliability Engineering with Niall Murphy


In a recent podcast, I was lucky to have a discussion with Niall Murphy about the role of Site Reliability Engineering. Niall contributed to the seminal SRE book and has many years of experience in this field, so it was an honour to get the opportunity to chat to him.

Humans have been thinking about better ways to operate things for millennia, but despite all of this effort and thought, running enterprise software operations well remains elusive for many organisations.

The underlying incentives for Development and Operations can seem to be at odds with each other: one wishes to make changes and add new features (Dev), whilst the other ensures the product or service does not break (Ops). The catch is that changing the product increases the possibility of something breaking.

As a result of this realisation, many forms of gatekeeping (launch reviews, deep-dives and checklists) have been put in place to ‘help’ mitigate the friction between the two parties, but this by no means solves the problem. It was very interesting to hear Niall share his experience with these problems, and how the SRE role and the philosophy behind it help to remedy them. Through the episode we were able to delve into some of the key components that make up SRE, from the value of having an Error Budget to the realisation that striving for 100% uptime is actually detrimental to the product itself!
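To make the Error Budget idea concrete, here is a small illustrative calculation. The 30-day window and the availability targets are assumptions chosen for the example, not figures from the episode; the budget is simply the share of the window that the target allows to be unavailable.

```python
def error_budget_minutes(slo, window_days=30):
    """Downtime allowed in the window for an availability target between 0 and 1."""
    total_minutes = window_days * 24 * 60
    return (1 - slo) * total_minutes


# Illustrative targets only; a real service would choose its own.
for slo in (0.99, 0.999, 1.0):
    budget = error_budget_minutes(slo)
    print(f"{slo:.1%} target: {budget:.1f} minutes of budget")
```

Under these assumptions a 99.9% target leaves roughly 43 minutes a month in which things are allowed to go wrong, whereas a 100% target leaves none, and with no budget there is no room to ship changes.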

You are able to listen to the episode in its entirety below, or by subscribing to the podcast.

splicer
2425 days ago
London

Understanding Uber: It’s Not About The App


On Friday 22 September, many Londoners who regularly use Uber received an email. “As you may have heard,” it began, “the Mayor and Transport for London have announced that they will not be renewing Uber’s licence to operate in our city when it expires on 30 September.”

“We are sure Londoners will be as astounded as we are by this decision,” the email continued, with a sense of disbelief. It then pointed readers towards an online petition against this attempt to “ban the app from the capital.”

Oddly, the email was sent by a company that TfL have taken no direct action against, and referred to an app that TfL have made no effort (and have no power) to ban.

When two become one

If that last statement sounds confusing, then that’s because – to the casual observer – it is. This is because the consumer experience that is “Uber” is not actually the same as the companies that deliver it.

And “companies” is, ultimately, correct. Although most users of the system won’t realise it, over the course of requesting, completing and paying for their journey an Uber user in London actually interacts with two different companies – one Dutch, one British.

The first of those companies is Uber BV (UBV). Based in the Netherlands, this company is responsible for the actual Uber app. When a user wants to be picked up and picks a driver, they are interacting with UBV. It is UBV that request that a driver be dispatched to the user’s location. It is also UBV who then collect any payment required.

At no point, however, does the user actually get into a car owned, managed or operated by UBV. That duty falls to the second, UK-based company – Uber London Ltd. (ULL). It is ULL who are responsible for all Uber vehicles – and their drivers – in London. Just like Addison Lee or any of the thousands of smaller operators that can be found on high streets throughout the capital, ULL are a minicab firm. They just happen to be one that no passenger has ever called directly – they respond exclusively to requests from UBV.

This setup may seem unwieldy, but it is deliberate. In part, it is what has allowed Uber to skirt the blurred boundary between being a “pre-booked” service and “plying-for-hire” (a difference we explored when we last looked at the London taxi trade back in 2015). It is also this setup that allows Uber to pay what their critics say is less than their ‘fair share’ of tax – Uber pays no VAT and, last year, only paid £411,000 in Corporation Tax.

The average Londoner can be forgiven for not knowing all of the above (commentators in the media, less so). In the context of the journey, it is the experience that matters, not the technology or corporate structure that delivers it. In the context of understanding the causes – and likely outcome – of the current licensing situation, however, knowing the difference between the companies that make up that Uber experience is important. Because without that, it is very easy for both Uber’s supporters and opponents to misunderstand what this dispute is actually about.

The raw facts

Uber London Ltd (ULL) are a minicab operator. This means they require a private hire operator’s licence. Licences last five years and ULL were last issued one in May 2012. They thus recently applied for its renewal.

ULL were granted a four-month extension to that licence this year. This was because TfL, who are responsible for regulating taxi services in London, had a number of concerns that ULL might not meet the required standard of operational practice that all private hire operators – from the smallest cab firm to Addison Lee – are required to meet. Issuing a four-month extension rather than a five-year one was intended to provide the time necessary to investigate those issues further.

On Friday 22 September, TfL announced that they believe ULL does not meet the required standard in the following areas:

  • Their approach to reporting serious criminal offences.
  • Their approach to how medical certificates are obtained.
  • Their approach to how Enhanced Disclosure and Barring Service (DBS) checks are obtained.
  • Their approach to explaining the use of Greyball in London – software that could be used to block regulatory bodies from gaining full access to the app and prevent officials from undertaking regulatory or law enforcement duties.

As a result, their application for a new licence has been denied.

ULL have the right to appeal this decision and can remain in operation until that appeal has been conducted. Similarly, if changes are made to their operational practices to meet those requirements to TfL’s satisfaction, then a new licence can be issued.

Put simply, this isn’t about the app.

So why does everyone think it is?

Washington DC, September 2012

“I know that you like to cast this as some kind of fight,” said Mary Cheh, Chair of the Committee on Transport and the Environment, “Do you understand that? I’m not in a fight with you.”

“When you tell us we can’t charge lower fares, offer a high-quality service at the best possible price, you are fighting with us,” replied Travis Kalanick, Uber’s increasingly high profile (and controversial) CEO.

“You still want to fight!” Cheh sighed, throwing her hands in the air.

Back in San Francisco, Salle Yoo, Uber’s chief counsel, was watching in horror via webcast. Pulling out her phone, she began frantically texting the legal team sitting with Kalanick in the room:

Pull him from the stand!!!

It was too late. Kalanick had already launched into a monologue on toilet roll prices in Soviet Russia. He had turned what had been intended as a (relatively) amicable hearing about setting a base fare for Uber X services in the city into an accusation – and apparent public rejection by Kalanick – of an attempt at consumer price fixing.

The events that day, which are recounted in Brad Stone’s ‘The Upstarts’, were important. In hindsight, they marked the point where Uber shifted gears and not only started to aggressively move in on existing taxi markets, but also began to use public support as a weapon.

Cause and effect

Weirdly, one of the causes for that shift in attitude and policy was something London’s Black Cab trade had done.

Kalanick and fellow founder Garrett Camp had launched Uber in 2008 with a simple goal – to provide a high-quality, reliable alternative to San Francisco’s notoriously awful taxis.

What’s important to note here is that neither man originally saw Uber as a direct price-for-price rival to the existing San Francisco taxi trade. San Francisco, like many cities in the USA, utilised a medallion system to help regulate the number of taxi drivers in operation at any one time. Over the years, the number of medallions available had not increased to match rising passenger demand.

Camp – who had moved to San Francisco after the sale of his first startup, StumbleUpon – became increasingly frustrated at his inability to get around town. Then Camp discovered that town car licences (for limousine services) weren’t subject to the medallion limits. Soon he began to float the idea of a car service for a pool of registered users that relied on limousine licences instead.

This would be more expensive than a regular cab service, but he argued that the benefits of better quality vehicles and a more reliable service would make it worthwhile in the eyes of users. A friend and fellow startup entrepreneur, Kalanick, agreed. They hired some developers and then started touting the idea to investors (often describing it as “AirBNB for taxis”). Uber grew from there.

As the company expanded, this ‘luxury on the cheap’ model sometimes brought Uber into conflict with the existing US taxi industry and individual city regulators. The fact that they were rarely undercutting the existing market helped limit resistance, however.

What eventually shifted Uber into a different gear was the arrival of a threat from abroad – Hailo.

Hailo wars

Founded by Jay Bregman in 2009, Hailo was a way for London’s Black Cab trade to combat the inroads private hire firms had been making into their market share. Those firms were starting to use the web and digital technology to make pre-booking much more convenient. Bregman had seen Uber’s app and realised the potential. He created Hailo as a way to help Black Cabs do the same thing.

At this, Hailo was initially successful. Bregman, an American by birth, soon started casting his eyes across the Atlantic at the opportunities to do the same thing there. In March 2012, Bregman announced that Hailo had raised $17m to fund an expansion into the US, where it would attempt to partner with existing cab firms as it had done in London.

Expansion into London had already been on Uber’s radar. They had also been aware of Hailo. Bregman’s announcement, however, turned a potential rival in an overseas market into a direct, domestic threat. Uber’s reaction was swift and aggressive, as was the ‘app war’ which soon erupted in cities such as Boston and New York where both firms had a presence.

One of the crucial effects of the Hailo wars was that they finally settled a long-running argument that had existed over Uber’s direction between its two founders. Camp had continued to insist that Uber offer luxury at a (smallish) premium. Kalanick had argued that it was convenience, at a low cost, that would drive expansion. When Hailo crossed the pond, offering a low-cost service, Kalanick’s viewpoint finally won out based on necessity.

Controlling the debate

As the Washington hearing would show later that year, Kalanick’s victory had enormous consequences – not just in terms of how the service was priced and would work (it would lead to the launch of Uber X, the product with which most users are familiar), but in how Uber would pitch itself to the public.

The approach that Kalanick took in his Washington testimony, of espousing the public need as being the same as Uber’s need, has since become a standard part of Uber’s tactics for selling expansion into new markets. The ability – often correct – to claim that Uber offers a better service at a cheaper price is a powerful selling point, one that Uber have never shied away from pushing.

It’s a simple argument. It is also one that Uber have used to drown out more complex objections from incumbent operators, regulators or politicians in areas into which they’ve expanded. It is also one of the reasons why Uber have continued to push the narrative that they are a technical disruptor when skirting (or sometimes ignoring) existing regulations – because being an innovative startup is ‘sexy’. Being a large company ignoring the rules isn’t.

Back to the licence

Understanding where Uber have come from, and their approach to messaging, is critical to understanding the London operator licence debate. Uber may have tried to frame it as a debate about the availability (or otherwise) of the app, but that’s not what this is. It is a regulatory issue between TfL as regulator and ULL as an operator of minicabs.

The decision to cast the debate in this way is undoubtedly deliberate. Uber are aware that their users are not just passengers, but a powerful lobbying group when pointed in the right direction – as long as the message is something they will get behind. Access to the Uber app is a simple message to sell; the need to lighten ULL’s corporate responsibilities is not.

Corporate responsibilities

One of the primary responsibilities of the taxi regulator in most locations is the consideration of passenger safety. This is very much the case in London – both for individual drivers and for operators.

The expectation of drivers is relatively obvious – that they do not break the law, nor commit a crime of any kind. The expectation of operators is a bit more complex – it is not just about ensuring that drivers are adequately checked before they are hired, but also that their activity is effectively monitored while they are working and that any customer complaints are taken seriously and acted upon appropriately.

The nature of that action can vary. The report of a minor offence may warrant only the intervention of the operator themselves or escalation to TfL. It is expected, however, that serious crimes will be dealt with promptly, and reported directly to the police as well.

On 12 April 2017, the Metropolitan Police wrote to TfL expressing a major concern. In the letter, Inspector Neil Bellany claimed that ULL were not reporting serious crimes to them. He cited three specific incidents by way of example.

The first of these related to a ‘road rage’ incident in which the driver had appeared to pull a gun, causing the passenger to flee the scene. Uber dismissed the driver, having determined that the weapon was a pepper spray, not a handgun, but failed to report the incident to the police. The police only became aware of it a month later when TfL, as regulator, processed ULL’s incident reports.

At this point, the police attempted to investigate (pepper spray is an offensive weapon in the UK) but, the letter indicated, Uber refused to provide more information unless a formal request via the Data Protection Act was submitted.

The other two offences were even more serious, and here it is best simply to quote the letter itself:

The facts are that on the 30 January 2016 a female was sexually assaulted by an Uber driver. From what we can ascertain Uber have spoken to the driver who denied the offence. Uber have continued to employ the driver and have done nothing more. While Uber did not say they would contact the police the victim believed that they would inform the police on her behalf.

On the 10 May 2016 the same driver has committed a second more serious sexual assault against a different passenger. Again Uber haven’t said to this victim they would contact the police, but she was, to use her words, ‘strongly under the impression’ that they would.

On the 13 May 2016 Uber have finally acted and dismissed the driver, notifying LTPH Licensing who have passed the information to the MPS.

The second offence of the two was more serious in its nature. Had Uber notified police after the first offence it would be right to assume that the second would have been prevented. It is also worth noting that once Uber supplied police with the victim’s details both have welcomed us contacting them and have fully assisted with the prosecutions. Both cases were charged as sexual assaults and are at court next week for hearing.

Uber hold a position not to report crime on the basis that it may breach the rights of the passenger. When asked what the position would be in the hypothetical case of a driver who commits a serious sexual assault against a passenger they confirmed that they would dismiss the driver and report to TfL, but not inform the police.

The letter concluded by pointing out that these weren’t the only incidents the Metropolitan Police had become aware of. In total, Uber had failed to report six sexual assaults, two public order offences and one assault to the police. This had led to delays of up to 7 months before they were investigated. Particularly damning was that, for both public order offences, the prosecution time limit had passed by the time the police became aware of them.

As the letter concludes:

The significant concern I am raising is that Uber have been made aware of criminal activity and yet haven’t informed the police. Uber are however proactive in reporting lower level document frauds to both the MPS and LTPH. My concern is twofold, firstly it seems they are deciding what to report (less serious matters / less damaging to reputation over serious offences) and secondly by not reporting to police promptly they are allowing situations to develop that clearly affect the safety and security of the public.

The Metropolitan police letter is arguably one of the most important pieces of evidence as to why TfL’s decision not to renew ULL’s licence is the correct one right now. Because one of the most common defences of Uber is that they provide an important service to women and others late at night. In places where minicabs won’t come out, or for people whose personal experience has left them uncomfortable using Black Cabs or other minicab services, Uber offer a safe, trackable alternative.

The reasoning behind that argument is completely and entirely valid. Right now, however, TfL have essentially indicated that they don’t trust ULL to deliver that service. The perception of safety does not match the reality.
Again, it is not about the app.

Greyball

Concerns about vetting and reporting practices in place at ULL may make up the bulk of TfL’s reasons for rejection, but they are not the only ones. There is also the issue of Greyball – a custom piece of software designed by Uber which can replace the ‘real’ Uber map that the user sees on their device with a convincing fake one.

Greyball’s existence was revealed to the world in March 2017 as part of an investigation by the New York Times into Uber’s activities in Portland back in 2014. The paper claimed that knowing that they were breaking the regulations on taxi operation in the city, Uber had accessed user data within its app to identify likely city officials and target them with false information. This ensured that those people were not picked up for rides, in turn hampering attempts by the authorities to police Uber’s activities there.

Initially, Uber denied the accusations. They confirmed that Greyball existed, but insisted that it was only used for promotional purposes, testing and to protect drivers in countries where there was a risk of physical assault.

Nonetheless, the seriousness of the allegations and the evidence presented by the New York Times prompted Portland’s Bureau of Transportation (PBOT) to launch an official investigation into Uber’s activities. That report was completed in April. It was made public at the beginning of September. In it, Portland published evidence – and an admission from Uber itself – that during the period in which it had been illegal for Uber to operate in Portland, they had indeed used Greyball to help drivers avoid taxi inspectors. In Portland’s own words:

Based on this analysis, PBOT has found that when Uber illegally entered the Portland market in December 2014, the company tagged 17 individual rider accounts, 16 of which have been identified as government officials using its Greyball software tool. Uber used Greyball software to intentionally evade PBOT’s officers from December 5 to December 19, 2014 and deny 29 separate ride requests by PBOT enforcement officers.

The report did confirm that, after regulatory changes allowed Uber to enter the market legally, there seemed to be no evidence that Greyball had been used for this purpose again. As the report states, however:

[i]t is important to note that finding no evidence of the use of Greyball or similar software tools after April 2015 does not prove definitively that such tools were not used. It is inherently difficult to prove a negative. In using Greyball, Uber has sullied its own reputation and cast a cloud over the TNC [transportation network company] industry generally. The use of Greyball has only strengthened PBOT’s resolve to operate a robust and effective system of protections for Portland’s TNC customers.

Portland also went one further. They canvassed other transport authorities throughout the US asking whether, in light of the discovery of Greyball, they now felt they had evidence or suspicions that they had been targeted in a similar way. Their conclusions were as follows:

PBOT asked these agencies if they have ever suspected TNCs of using Greyball or any other software programs to block, delay or deter regulators from performing official functions. As shown in figure 3.0 below, seven of the 17 agencies surveyed suspected Greyball use, while four agencies (figure 3.1) stated that they have evidence of such tactics. One agency reported that they only have anecdotal evidence, but felt that drivers took twice as long to show up for regulators during undercover inspections. The other agencies cities believe that their enforcement teams and/or police officers have been blocked from or deceived by the application during enforcement efforts.

Uber are now under investigation by the US Department of Justice for their use of Greyball in the US.

Of all the transport operators in Europe, TfL are arguably the most technically literate. It is hard to see how the potential use of Greyball wouldn’t have raised eyebrows within the organisation so it is not surprising to see it make the list of issues. A regulator is only as good as their ability to regulate, and as the Portland report shows, Uber now have ‘form’ for blocking that ability.

Sources suggest that TfL have requested significant assurances and guarantees that Greyball will not be used in this way in London. The fact that it makes the list of issues, however, suggests that this demand has currently not been met. It is possible this is one of the times when Uber’s setup – multiple companies under one brand – has caused a problem outside of ULL’s control. Uber Global may ultimately be the only organisation able to provide such software assurances.

Until now, they may simply not have realised just how important it was that they give them.

Understanding the economics

There is still much more to explore on the subject of Uber. Not just Uber London’s particular issues with TfL, but the economics of how they operate and what their future plans might be.

That last part is important because the main element of Uber’s grand narrative – their continued ability to offer low fares – is not as guaranteed a prospect as Londoners (and indeed all users) have been led to believe.

We will explore this more in our next article on the subject but, in the context of the current debate, it is worth bearing something in mind: Uber’s fares do not cover the actual cost of a journey.

Just how large the deficit is varies by territory and – as the firm don’t disclose more financial information than necessary – it is difficult to know what the shortfall per trip is in London itself. In New York, however, where some 2016 numbers are available, it seems that every journey only covers 41% of the costs involved in making it.

Just why Uber do this is something we will explore another time, but for now it is important just to know it is happening. It means that, without significant changes to Uber’s operational model, the company will never make a profit (indeed it currently loses roughly $2bn a year). As one expert in transport economics writes:

Thus there is no basis for assuming Uber is on the same rapid, scale economy driven path to profitability that some digitally-based startups achieved. In fact, Uber would require one of the greatest profit improvements in history just to achieve breakeven.

What it does mean though is that Uber’s cheap fares – sometimes argued as one of the ways in which it provides a ‘social good’ for low-income users – are likely only temporary.

Indeed the only way this won’t be the case is if there is a significant technical change to the way Uber delivers its service. In this regard, Uber has often pulled on its reputation as a ‘startup’ and has pointed to the economies of scale made by companies such as Amazon.

Unfortunately, this simply isn’t how transport works. Up to 80% of the cost of each Uber journey is fixed cost – it goes on the driver, the fuel and the vehicle. This is a cost which scales in a linear fashion. Put simply, the number of books Amazon can fit in a warehouse once it’s been built (and paid for) increases exponentially. The number of passengers Uber can stick in a car does not.

Uber, of course, are aware of this. Indeed it’s why they have quickly become one of the biggest investors in self-driving vehicle technology (and are subject to a lawsuit from Google over the theft of information related to that subject).

Again we will explore this more at a later date, but for now it is worth bearing in mind that Uber’s stated concern for their ‘40,000 drivers’ in London should be taken with a considerable pinch of salt. Not only is the active figure likely closer to 25,000 (based on Uber’s own growth forecasts from last year), but they would also quite like to get rid of them anyway – or at the very least squeeze their income further in order to push that cost-per-journey figure closer to being in the black.

Bullying a bully

None of these issues with Uber’s operations are likely exclusive to London. Which begs the question – why have TfL said ‘no’ when practically everyone else has said ‘yes’?

To a large extent, the extreme public backlash this news received, and the size of Uber’s petition provide the answer – because Uber are a bully. Unfortunately for them, TfL can be an even bigger one.

TfL aren’t just a transport authority. They are arguably the largest transport authority in the world. Indeed legislatively speaking TfL aren’t really a transport authority at all (at least not in the way most of the world understands the term). TfL are constituted as a local authority. One with an operating budget of over £10bn a year. They also have a deep reserve of expertise – both legal and technical.

Nothing to divide

To make things worse for Uber, TfL aren’t accountable to an electorate. They serve, and act, at the pleasure of just one person – the Mayor of London, the third most powerful directly elected official in Europe (behind the French and Russian presidents).

This is a problem for Uber. In almost every other jurisdiction they have operated in, Uber have been able to turn their users into a political weapon. That weapon has then been turned on whatever political weak point exists within the legislature of the state or city it is attempting to enter, using popularism to get regulations changed to meet Uber’s needs.

The situation in London is practically unique, simply because there is only one weak point that can be exploited – that which exists between TfL and the Mayor.

Just how much direct power the Mayor of London exercises over TfL is one of the themes that has been emerging from our transcripts of the interviews conducted for the Garden Bridge Report. To quote the current Transport Commissioner, Mike Brown, in conversation with Margaret Hodge, MP:

Margaret Hodge: But it’s your money.

Mike Brown: Yes, I know but the Mayor can do what he wants as the Chair of the TfL Board.

MH: Without accountability to the Board?

MB: Yes and Mayoral Directions are — the Mayor is actually extremely powerful in terms of Mayoral Directions. He or she can do whatever they want.

MH: What, to whatever upper limit you want?

Andy Brown: That’s right, I think, yeah.

MB: Yeah, pretty much is. Yeah — so arguably it’s more direct financial authority than even a Prime Minister would have, for example.

As long as TfL and the Mayor, Sadiq Khan, remain in lockstep on the licence issue, therefore, Uber’s most powerful weapon has no ammunition. 500,000 signatures mean nothing to TfL if the organisation has the backing of the Mayor and they are confident of a victory in court.

It is also worth noting that all TfL really wants Uber to do is comply with the rules. Despite the image that has been pushed in some sections of the media, TfL has not suddenly become the champion of the embattled London cabbie. TfL has always seen itself as the taxi industry’s regulator, not as the Black Cab’s saviour. This was true back in 2015 when the Uber debate really erupted in earnest and it is still true now.

Indeed if TfL have any kind of ulterior motive for their actions, it is simply that they dislike the impact Uber are having on congestion within the capital, and the effect this congestion then has on the bus network. Indeed Uber would do well to remember the last time a minicab operator made the mistake of making it harder for TfL’s buses to run on time.

That the Mayor was prepared to go so public in his support for such an unpopular action should also serve as a warning for Uber.

When it comes to legal action, TfL are risk-averse in the extreme – there is a reason they have never sued the US Embassy over unpaid Congestion Charge fees. The current Mayor is even more so.

Whilst his field is human rights rather than transport, Khan is a lawyer himself and by all accounts a good one. GQ’s Politician of the Year is also an extremely shrewd political operator. It is unlikely that he would have lent his support to TfL on this subject unless he knew it was far more likely to make him look like a statesman who stands up to multinationals, than a man who steals cheap travel from the electorate.

Ultimately, the next few days (and beyond) will likely come to define the relationship between London and Uber. Indeed sources suggest that Uber have already begun to make conciliatory noises to TfL, as the seriousness of the situation bubbles up beyond ULL and UBV to Uber Global itself. Only time will tell if this is true.

In the meantime, however, the next time you see a link to a petition or someone raging about this decision being ‘anti-innovation’, remember Greyball. Remember the Metropolitan Police letter. Remember that this is about holding ULL, as a company, to the same set of standards to which every other mini-cab operator in London already complies.

Most of all though remember: it is not about the app.

In the next part of this series, we will look at the economics of Uber, their internal culture, impact on roadspace and relationship with their drivers.

The post Understanding Uber: It’s Not About The App appeared first on London Reconnections.

splicer
2671 days ago
Well written and impartial write up on why TFL is not renewing ULL license.
London
1 public comment
miestasmagnus
2671 days ago
As it turns out, TfL had some very good reasons for not renewing Uber's licence:

Developing with Symfony 2 and Docker

1 Share

I had a couple of aims for my most recent quick project, Dashli:

  • be able to avoid polluting my machine with PHP, MySQL, and those other services which quickly get littered around,
  • be able to really quickly deploy to a service, without having to ssh in and set everything up

Docker works well with both of these points.

There’s a simple Dockerfile which lets you write down the steps you’d take to build a box and set it up for your application. That means that during development, Docker knows how to build a virtual machine which’ll let me keep the entire server infrastructure hidden away from my “local” OS. No more worries about which PHP version I need for different applications and I don’t need to keep a MySQL server running for no good reason.

Together with tools like Docker Cloud (free with one private instance) and Digital Ocean (not free, but you can get $10 – enough for two months of a small server – when you sign up with this link), it’s super easy to have my project uploaded to a cloud instance and running without me even having to know how to SSH into the box.

I had this all set up and running brilliantly. The problem was when I came back to my machine to continue development.

Tiny change, rebuild. Tiny change, rebuild.

The most frustrating part of this new flow for me was finding a simple typo and then having to redo my docker build . step, which isn’t quick since it has to do a full composer install again.

The fix here was mounted volumes – let the virtual machine mount my project directory as nginx’s root directory. This is simple, actually. When running your container, do this instead (the last argument is the name of the image you built, shown here as a placeholder):

docker run --publish 80:80 --volume /local/project/path:/virtual/machine/path/to/project <your-image>

Now, when you update a file in your local project, it’ll be instantly reflected inside the docker. Refresh your browser and you’ll notice the updates.

composer files getting thrown away

You’re probably running composer.phar install inside your Dockerfile. That will still happen. However, when Docker goes ahead and mounts your local directory over the VM’s /virtual/machine/path/to/project, it basically hides all that was in there. That will include your vendor/ folder, which will now appear empty.

It’s obvious why – you’ve never run composer locally, you don’t even have PHP installed locally so how could you! You need composer to be run within the Docker, and hopefully without having an effect on your local machine.

To do this, we can move our vendors outside of the project. I know, right. Weird. Composer has support for this though. In my composer.json you’ll see this:

"config": {
    "bin-dir": "bin",
    "vendor-dir": "/tmp/dashli/vendor"
}

This tells composer to download the vendors in the /tmp/dashli/vendor directory (I chose /tmp/ as everyone can write to it – this might not be the smartest place to put it). I also had to edit my app/autoload.php so Symfony knows where the autoloader is.

$loader = require '/tmp/dashli/vendor/autoload.php';

Now your dependencies can be installed on the virtual machine, out of the way of your local machine.

Getting around root created files

The huge downside of this is that any file written by the www-data user inside your container is actually going to be written locally by root. You’ll now find your local project’s var/cache/, var/session/, etc. full of files owned by root. You’ll have to sudo rm them to get rid of them, which is not part of a healthy development flow.

Symfony comes fully equipped to handle this problem though: in your app/AppKernel.php, you’ll have to change some of the overridden methods to point them to your /tmp/ directory, just like we did with composer dependencies above.

You can actually read more about this here on the Symfony docs.
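
As a rough sketch of what that override might look like: Symfony's kernel exposes getCacheDir() and getLogDir(), and pointing both at a directory under /tmp keeps container-written files out of the mounted project. The StandInKernel class below is a hypothetical stand-in so the snippet runs on its own (a real AppKernel extends Symfony\Component\HttpKernel\Kernel), and the /tmp/dashli paths are an assumption chosen to mirror the vendor directory above.

```php
<?php

// Hypothetical stand-in for Symfony\Component\HttpKernel\Kernel, included
// only so this sketch runs without Symfony installed.
abstract class StandInKernel
{
    protected $environment;

    public function __construct($environment)
    {
        $this->environment = $environment;
    }

    public function getEnvironment()
    {
        return $this->environment;
    }
}

class AppKernel extends StandInKernel
{
    // Assumed location, chosen to sit beside /tmp/dashli/vendor.
    const RUNTIME_DIR = '/tmp/dashli';

    // Cache files are written per environment, outside the mounted project.
    public function getCacheDir()
    {
        return self::RUNTIME_DIR . '/cache/' . $this->getEnvironment();
    }

    // Log files also live outside the mounted project.
    public function getLogDir()
    {
        return self::RUNTIME_DIR . '/logs';
    }
}

$kernel = new AppKernel('dev');

echo $kernel->getCacheDir(), "\n";
echo $kernel->getLogDir(), "\n";
```

Session files are not controlled by the kernel, so they would need to be moved separately through the framework's session configuration.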

Hopefully that leads you to an easy development experience with Symfony and Docker!

splicer
3230 days ago
London

Insertion, Removal and Inversion Operations on Binary (Search) Trees in PHP

1 Share

Recently Max Howell (creator of Homebrew) posted an interesting tweet in regard to Google's interview process. In this tweet he mentioned how one of the proposed questions was to white-board a solution to invert a binary tree. Over the past couple of years I have been interested in exploring fundamental Computer Science data-structures and algorithms. As a result, I thought it would be interesting to explore this structure and associated operations in more depth - using immutable and mutable PHP implementations to clearly highlight the benefits garnered from each approach.

Binary trees are a form of tree data-structure, comprised of nodes with assigned values and at-most two child nodes (left and right). To expand on this problem I will be documenting the creation of a Binary Search Tree, which has the additional invariant that any left child be less than, and any right child be greater than, the current node's value. This allows us to perform unambiguous node deletion from the structure.

Node Representation

Throughout all these examples I leaned towards a simple function-based, as opposed to class-based, approach, relying on namespaces to infer relation. The following function can be used to create a simple object representation of a tree node. Given a value and optional left and right nodes, it simply returns the aggregate.

function Node($value, $left = null, $right = null)
{
    return (object) compact('value', 'left', 'right');
}
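
To make the search-tree invariant concrete, here is a minimal sketch of a checker. isSearchTree() is a hypothetical helper that is not part of the original code, and it applies the usual stricter reading of the invariant: every value must fall between the bounds inherited from all of its ancestors, not just its direct parent.

```php
<?php

// Node() is repeated from the article so the sketch runs on its own.
function Node($value, $left = null, $right = null)
{
    return (object) compact('value', 'left', 'right');
}

// Hypothetical helper: true when every value lies strictly between the
// bounds inherited from its ancestors (null means "no bound").
function isSearchTree($root, $min = null, $max = null)
{
    if ($root === null) {
        return true;
    }

    if ($min !== null && $root->value <= $min) {
        return false;
    }

    if ($max !== null && $root->value >= $max) {
        return false;
    }

    return isSearchTree($root->left, $min, $root->value)
        && isSearchTree($root->right, $root->value, $max);
}

$valid = Node(2, Node(1), Node(3, null, Node(4)));
$invalid = Node(2, Node(3), Node(1));

var_dump(isSearchTree($valid));   // bool(true)
var_dump(isSearchTree($invalid)); // bool(false)
```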

Insertion

Now that we are able to represent tree nodes, the next logical step is to provide insertion capabilities. Accessible from the root tree node (initially NULL), we are able to recursively traverse the structure until we either find a leaf or the node already present. To decide which child to traverse down we use the discussed comparator invariant. Below is a small diagram depicting the insertion of a value within an existing tree.

Binary Search Tree Insertion

Mutable

The first implementation provides a mutable means of insertion. Notice the explicit reassignment of the right and left child node references.

function insert($value, $root)
{
    if ($root === null) {
        return Node($value);
    }

    if ($value === $root->value) {
        return $root;
    }

    if ($value > $root->value) {
        $root->right = insert($value, $root->right);
    } else {
        $root->left = insert($value, $root->left);
    }

    return $root;
}

Immutable

Below is an immutable implementation of the insertion process starting from a rooted tree node. As opposed to modifying pre-existing state, we instead build up a modified representation, creating new nodes when required. This allows us to use both the new and old tree representations simultaneously.

function insert($value, $root)
{
    if ($root === null) {
        return Node($value);
    }

    if ($value === $root->value) {
        return $root;
    }

    if ($value > $root->value) {
        return Node($root->value, $root->left, insert($value, $root->right));
    }

    return Node($root->value, insert($value, $root->left), $root->right);
}
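
A small usage sketch may help show what "using both representations simultaneously" means in practice. It repeats Node() and the immutable insert() so it runs on its own; the example values are arbitrary. Only the nodes on the path to the inserted value are rebuilt, while untouched subtrees are shared between the old and new trees.

```php
<?php

// Node() and the immutable insert() are repeated from the article.
function Node($value, $left = null, $right = null)
{
    return (object) compact('value', 'left', 'right');
}

function insert($value, $root)
{
    if ($root === null) {
        return Node($value);
    }

    if ($value === $root->value) {
        return $root;
    }

    if ($value > $root->value) {
        return Node($root->value, $root->left, insert($value, $root->right));
    }

    return Node($root->value, insert($value, $root->left), $root->right);
}

$old = Node(2, Node(1), Node(3));
$new = insert(4, $old);

var_dump($old->right->right === null); // bool(true): the original is untouched
var_dump($new->right->right->value);   // int(4)
var_dump($new->left === $old->left);   // bool(true): the left subtree is shared
var_dump($new->right === $old->right); // bool(false): the path to 4 was rebuilt
```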

Removal

With the ability to now insert nodes into the tree, we can expand on this by performing the inverse operation, that being removal. When removing a node from the tree, there are three different use-cases that need to be addressed. The first two are relatively simple cases, met when the node in question has zero or one child. When no children are present we are able to just remove the reference to the node. In the case of a single child node, we can replace the parent's reference to the node with that child. The third case is a little more tricky, requiring us to rearrange the structure to find a new node to replace this one, maintaining the desired invariant.

There are two common techniques to achieve this, either finding the in-order successor or in-order predecessor and replacing the current node with this result. In these examples I have opted for the in-order successor, which requires us to find the minimal value of the current node's right subtree. This operation can be succinctly codified recursively as shown below.

function minValue($root)
{
    if ($root->left === null) {
        return $root->value;
    }

    return minValue($root->left);
}
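
For comparison, the in-order predecessor approach mentioned above would use the mirror image of this function. maxValue() is a hypothetical name, not part of the original code: the largest value in a subtree is found by following right children until none remain.

```php
<?php

// Node() is repeated from the article so the sketch runs on its own.
function Node($value, $left = null, $right = null)
{
    return (object) compact('value', 'left', 'right');
}

// Hypothetical mirror of minValue(): keep following right children.
function maxValue($root)
{
    if ($root->right === null) {
        return $root->value;
    }

    return maxValue($root->right);
}

$tree = Node(5, Node(2, Node(1), Node(4, Node(3))), Node(8));

// The in-order predecessor of the root is the largest value in its left subtree.
echo maxValue($tree->left), "\n"; // 4
```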

Below is a small diagram depicting the removal of a node which meets the third use-case.

Binary Search Tree Removal

Mutable

In a similar fashion to how mutable insertion can be carried out, we reassign the node's right and left references when required. We also replace the current node's value, if the operation falls into the described third use-case.

function remove($value, $root)
{
    if ($root === null) {
        return $root;
    }

    if ($value > $root->value) {
        $root->right = remove($value, $root->right);

        return $root;
    }

    if ($value < $root->value) {
        $root->left = remove($value, $root->left);

        return $root;
    }

    if ($root->right === null) {
        return $root->left;
    }

    if ($root->left === null) {
        return $root->right;
    }

    $value = minValue($root->right);

    $root->right = remove($value, $root->right);

    $root->value = $value;

    return $root;
}

Immutable

In the immutable instance we instead return new nodes in place of the reassignment that would have occurred in the mutable version. This allows us to maintain and access the entire original tree structure, whilst reusing unmodified references in the new tree.

function remove($value, $root)
{
    if ($root === null) {
        return $root;
    }

    if ($value > $root->value) {
        return Node($root->value, $root->left, remove($value, $root->right));
    }

    if ($value < $root->value) {
        return Node($root->value, remove($value, $root->left), $root->right);
    }

    if ($root->left === null) {
        return $root->right;
    }

    if ($root->right === null) {
        return $root->left;
    }

    $value = minValue($root->right);

    return Node($value, $root->left, remove($value, $root->right));
}
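
The three removal cases can be exercised with a short sketch. It repeats Node(), minValue() and the immutable remove() so it runs on its own; inOrder() is a hypothetical helper added only to print each tree as a sorted list, and the example values are arbitrary.

```php
<?php

// Node(), minValue() and the immutable remove() are repeated from the article.
function Node($value, $left = null, $right = null)
{
    return (object) compact('value', 'left', 'right');
}

function minValue($root)
{
    if ($root->left === null) {
        return $root->value;
    }

    return minValue($root->left);
}

function remove($value, $root)
{
    if ($root === null) {
        return $root;
    }

    if ($value > $root->value) {
        return Node($root->value, $root->left, remove($value, $root->right));
    }

    if ($value < $root->value) {
        return Node($root->value, remove($value, $root->left), $root->right);
    }

    if ($root->left === null) {
        return $root->right;
    }

    if ($root->right === null) {
        return $root->left;
    }

    $value = minValue($root->right);

    return Node($value, $root->left, remove($value, $root->right));
}

// Hypothetical helper: flattens a tree into an ascending list of values.
function inOrder($root)
{
    if ($root === null) {
        return [];
    }

    return array_merge(inOrder($root->left), [$root->value], inOrder($root->right));
}

$tree = Node(5, Node(2, Node(1)), Node(8, Node(6), Node(9)));

echo implode(',', inOrder(remove(6, $tree))), "\n"; // no children:  1,2,5,8,9
echo implode(',', inOrder(remove(2, $tree))), "\n"; // one child:    1,5,6,8,9
echo implode(',', inOrder(remove(5, $tree))), "\n"; // two children: 1,2,6,8,9
echo implode(',', inOrder($tree)), "\n";            // original:     1,2,5,6,8,9
```

In the two-child case the new root takes the successor's value (6), and the original tree can still be traversed in full afterwards.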

Inversion

Now that we have the ability to insert and remove nodes from a binary tree, another operation which can be performed is inversion (like the tweet mentioned). I should note that this operation does not typically occur on Binary Search Trees as it violates the additional invariant.

Mutable

From a mutable perspective we are able to invert the tree in a memory efficient manner, with only references being altered.

function invert($root)
{
    if ($root === null) {
        return $root;
    }

    $tmp = $root->left;
    $root->left = invert($root->right);
    $root->right = invert($tmp);

    return $root;
}

Immutable

In the case of immutability, an entirely new tree is required to be built. To maintain the ability to reuse the current tree, we are not able to manipulate any of the existing nodes' references.

function invert($root)
{
    if ($root === null) {
        return $root;
    }

    return Node($root->value, invert($root->right), invert($root->left));
}
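
The earlier remark that inversion violates the search-tree invariant can be checked with an in-order traversal. This sketch repeats Node() and the immutable invert(); inOrder() is a hypothetical helper, and the tree values are arbitrary. A valid search tree reads in ascending order, while its inverted copy reads in descending order.

```php
<?php

// Node() and the immutable invert() are repeated from the article.
function Node($value, $left = null, $right = null)
{
    return (object) compact('value', 'left', 'right');
}

function invert($root)
{
    if ($root === null) {
        return $root;
    }

    return Node($root->value, invert($root->right), invert($root->left));
}

// Hypothetical helper: visits left subtree, node, then right subtree.
function inOrder($root)
{
    if ($root === null) {
        return [];
    }

    return array_merge(inOrder($root->left), [$root->value], inOrder($root->right));
}

$tree = Node(2, Node(1), Node(3, null, Node(4)));

echo implode(',', inOrder($tree)), "\n";         // 1,2,3,4
echo implode(',', inOrder(invert($tree))), "\n"; // 4,3,2,1
```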

Example

Now that we have these operations in place we can use the following function to generate a tree from an array representation. Notice the use of the mutable insert operation to save on memory costs.

function fromArray(array $values)
{
    $tree = null;

    foreach ($values as $value) {
        $tree = BinaryTree\Mutable\insert($value, $tree);
    }

    return $tree;
}

We can then visualise the generated tree by using the following render function. This function displays the tree from left to right, as opposed to the typical top down approach.

function render($root, $depth = 0)
{
    if ($root === null) {
        return str_repeat("\t", $depth) . "~\n";
    }

    return
        render($root->right, $depth + 1) .
        str_repeat("\t", $depth) . $root->value . "\n" .
        render($root->left, $depth + 1);
}

Finally, we are able to use all these operations and helper functions in conjunction, for a contrived example.

$a = BinaryTree\fromArray([ 2, 1, 3, 4 ]);

$b = BinaryTree\Immutable\remove(2, $a);

echo BinaryTree\render($a);

/*
                        ~
                4
                        ~
        3
                ~
2
                ~
        1
                ~

*/

BinaryTree\Mutable\invert($b);

echo BinaryTree\render($b);

/*
                ~
        1
                ~
3
                ~
        4
                ~
*/
splicer
3465 days ago
London

Conductor: A return to monolith

1 Share

Conductor is a tool that allows you to gain the advantages of separate components without the downsides of having multiple repositories. Since the long-needed arrival of Composer in the PHP world, having one repository per package has been propelled into the mainstream and is often used without any questioning.

For product development, in which different applications depend on internal packages - which is our case at MyBuilder.com - having a repository for each package caused a coordination overhead that slowed down development. It just didn't work for us.

Imagine you have three applications (admin, API and frontend) that depend on the same internal package and this package keeps changing. And every time the package changes you also have to update those three applications.

Coordination Overhead

Because of that we had to switch back to a monolith repository and developed a tool to fix the problem of working with multiple composer.json files in it.

We do not advocate against any approach and in fact we think both should be used. But if you are entangled in the above situation this post is perfect for you! :)

Let's explore the problem more deeply, so we can better understand it:

Multiple repository approach

This approach is good for open-source projects, especially when contribution and versioning are important.

Check out this post from Matthew Weier O'Phinney, it's excellent in explaining it.

Single repository approach

From a product development perspective, sharing code and having the latest version of your packages is mandatory, as they contain business rules that change fast. In a single repository you can also unify tests and continuous integration (CI) in one tool and create single pull requests that affect more than one application when a change occurs.

Not long ago in a distant galaxy

Now that we explained the problem really well (hopefully), let's talk a little bit about how we solved this problem.

Composer doesn't support multiple composer.json files in a single repository and we needed it to maintain isolation (that's why we use packages daaaahhh). We want everything in one place but still isolated! Interesting problem!!

We also wanted to make single pull requests when packages changed, especially if the changes impacted different applications. We wanted everything documented in one place.

And also we wanted unified tooling for tests and CI.

Oh gosh, and for us only the latest version of our packages matter, because our business rules change frequently.

With those problems in mind we decided to create Conductor.

Conductor

What is Conductor

Conductor is a console application, which can be seen as a tool to help Composer manage internal packages.

When is it needed?

Conductor is needed when you have a single repository but wish to depend on multiple internal packages each with their own composer.json files.

How conductor solves the problem

We realised that we could use Composer's 'artifact' repository type as our internal packages source, but we cheated a bit as instead of zipping the package's source code into the artifact folder we just put instructions for Conductor and Composer in there.

Another problem we came across was that Composer generates absolute paths in the lock file even when we have given a relative path to the artifact directory. We run our applications in different environments (e.g. developer machines, dev server, production server), so changing the composer.lock absolute paths in the artifact files to relative paths was necessary.

Another thing was that it didn't feel right having multiple copies of the packages in the same repository, so to fix it we symlink the packages in the vendor folder to their source folder. That's another important Conductor feature.

And now let's see in step-by-step fashion how Conductor works:

  1. It uses Composer artifact repository type as an internal packages source.
  2. When composer install or composer update is run, only info about all the dependent packages is zipped and archived in the artifact folder.
  3. When running composer install, Composer finds each package's composer.json file in the zipped package file in the artifact/ folder so it can resolve the dependencies.
  4. After running composer update Conductor then re-writes the composer.lock file to use relative paths to the artifact folder.
  5. Before Composer dumps the autoloader, Conductor symlinks the packages in the vendor folder.
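
Step 4 above can be illustrated with a rough sketch. This is not Conductor's actual code: relativiseArtifactPaths() is a hypothetical helper, and both the sample lock data and the assumption that artifact packages record their archive location under dist.url are made up for illustration.

```php
<?php

// Hypothetical helper, not Conductor's actual implementation: rewrites
// absolute dist URLs that point inside $projectRoot into relative paths.
function relativiseArtifactPaths(array $lock, $projectRoot)
{
    $prefix = rtrim($projectRoot, '/') . '/';

    foreach (['packages', 'packages-dev'] as $section) {
        if (!isset($lock[$section])) {
            continue;
        }

        foreach ($lock[$section] as $i => $package) {
            if (!isset($package['dist']['url'])) {
                continue;
            }

            $url = $package['dist']['url'];

            // Only touch paths inside the project; remote URLs stay as-is.
            if (strpos($url, $prefix) === 0) {
                $lock[$section][$i]['dist']['url'] = substr($url, strlen($prefix));
            }
        }
    }

    return $lock;
}

// Made-up lock data, shaped like a decoded composer.lock file.
$lock = [
    'packages' => [
        [
            'name' => 'acme/package2',
            'dist' => ['type' => 'zip', 'url' => '/home/dev/project/artifact/acme-package2.zip'],
        ],
        [
            'name' => 'vendor/remote',
            'dist' => ['type' => 'zip', 'url' => 'https://example.com/remote.zip'],
        ],
    ],
];

$rewritten = relativiseArtifactPaths($lock, '/home/dev/project');

echo $rewritten['packages'][0]['dist']['url'], "\n"; // artifact/acme-package2.zip
echo $rewritten['packages'][1]['dist']['url'], "\n"; // https://example.com/remote.zip
```

Working on the decoded array rather than the raw file keeps the sketch independent of how the lock file is read and written back.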

Coordination Overhead

As you can see in the above image, if we didn't have Conductor we would need to update five repositories if we ever needed to change Package2.

The benefits of using Conductor

Now with Conductor everything is fantastic! You get all the benefits of having everything in a single repository without losing any of the benefits of isolated packages. Also we can now do single pull requests and use uniform tooling for tests and CI, what a dream!!

splicer
3515 days ago
London