The Daily Parker

Politics, Weather, Photography, and the Dog

Q is for Querying

Posting day 17 of the Blogging A-to-Z challenge just a little late because of stuff (see next post). Apologies.

Today's topic is querying, which .NET makes relatively easy through the magic of LINQ. Last week I showed how LINQ works when dealing with in-memory collections of things. In combination with Entity Framework, or another object-relational mapper (ORM), LINQ makes getting data out of your database a ton easier.

When querying a database in a .NET application, you will generally need a database connection object, a database command object, and a data reader. Here's a simple example using SQL Server:

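// Requires: using System; using System.Data.SqlClient;
public void DirectQueryExample(string connectionString)
{
	// The using blocks dispose the connection, command, and reader when done
	using (var conn = new SqlConnection(connectionString))
	using (var command = new SqlCommand("SELECT * FROM LookupData", conn))
	{
		conn.Open();
		using (var reader = command.ExecuteReader())
		{
			// Read() advances to the next row and returns false when none remain
			while (reader.Read())
			{
				Console.WriteLine(reader[0]);
			}
		}
	}
}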

(Let's skip for now how bad it is to execute raw SQL from your application.)

With Entity Framework (or another ORM), the ORM generates classes that represent tables or views in your database. Imagine you have an animals table, represented by an animal class in your data project. Finding them in your database might now look like this:

public IEnumerable<Animal> OrmQueryExample(string species)
{
	var result = new List<Animal>();
	using (var db = Orm.Context)
	{
		var dtos = db.Animals.Where(p => p.Species == species);
		// Map each DTO in memory; Select returns the mapped sequence
		result.AddRange(dtos.AsEnumerable().Select(MapDtoToDomainObject));
	}

	return result;
}

private Animal MapDtoToDomainObject(AnimalDto animalDto)
{
	// Code elided
}

That looks a little different, no? Instead of opening a connection to the database and executing a query, we use a database context that Entity Framework supplies. We then execute a LINQ query directly on the Animals table representation with a predicate. The ORM handles constructing and executing the query, and returns an IQueryable<T> of its self-generated Animal data transfer object (DTO). When we get that collection back, we map the fields on the DTO to the domain object we want, and return an IEnumerable<T> of our own domain object back to the caller. If the query comes up empty, we return an empty list. (Here's a decent Stack Overflow post on the difference between the two collection types.)
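Here is a minimal sketch of that query-then-map flow that runs without a database. It assumes an in-memory list wrapped in AsQueryable() as a stand-in for the ORM's table representation; the AnimalDto and Animal shapes, the AnimalQueries class, and the FindBySpecies name are all hypothetical, invented for illustration rather than taken from Entity Framework or the code above.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical DTO, standing in for the class an ORM would generate
public class AnimalDto
{
	public string Species { get; set; }
	public string Name { get; set; }
}

// Hypothetical domain object that callers actually want
public class Animal
{
	public string Species { get; set; }
	public string DisplayName { get; set; }
}

public static class AnimalQueries
{
	// The IQueryable parameter stands in for db.Animals (an assumption)
	public static IEnumerable<Animal> FindBySpecies(IQueryable<AnimalDto> animals, string species)
	{
		return animals
			.Where(p => p.Species == species) // a real ORM would translate this predicate
			.AsEnumerable()                   // everything after this runs in memory
			.Select(MapDtoToDomainObject)
			.ToList();                        // materialize the results before returning
	}

	private static Animal MapDtoToDomainObject(AnimalDto dto)
	{
		return new Animal { Species = dto.Species, DisplayName = dto.Name };
	}
}
```

Calling FindBySpecies with a species that isn't in the list returns an empty list rather than null, which matches the behavior described above.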

These are naive examples, of course, and there are many other ways of using EF. For example, for field mapping we might use a package like AutoMapper instead of rolling our own field-level mapping. I encourage you to read up on EF and related technologies.

P is for Polymorphism

We're now past the half-way point, 16 days into the Blogging A-to-Z challenge. Time to go back to object-oriented design fundamentals.

OO design has four basic concepts: abstraction, encapsulation, inheritance, and polymorphism.

All four have specific meanings. Today we'll just look at polymorphism (from Greek: "poly" meaning many and "morph" meaning shape).

Essentially, polymorphism means using the same identifiers in different ways. Let's take a contrived but common example: animals.

Imagine you have a class representing any animal (see under "abstraction"). Animals can move. So:

public abstract class Animal
{
	public abstract void Move();
}

Notice that the Move method has no implementation, since animal species have many different ways of moving.

Now imagine two concrete animal classes:

public class Dog : Animal
{
	public override void Move() 
	{
		// Walk like a quadruped
	}
}

public class Guppy : Animal
{
	public override void Move() 
	{
		// Swim like a fish
	}
}

Guppies and dogs both move around just fine in their own environments, and dogs can move around in the littoral areas of a guppy's environment as well. So both animals have a Move method.

In this way, the Move method is polymorphic. A caller doesn't need to know anything about guppies or dogs in order to get them to move. And the implementations of the Move method will be completely different:

public void MoveAll(IEnumerable<Animal> animals)
{
	// IEnumerable<T> has no built-in ForEach, so use a plain loop
	foreach (var animal in animals)
	{
		animal.Move();
	}
}

That method doesn't care what the list contains. It moves them all the same.

Now imagine this class:

public class Electron : Lepton
{
	public override void Move() 
	{
		// Move like an electron, however that works
	}
}

Electrons move too. The implementation of Electron.Move() differs from Dog.Move() or Guppy.Move() so vastly that no one really knows how electrons do it. But if you call Electron.Move(), you expect the thing to move.

I've only given examples of subtyping and duck typing today, so it's worth reading more about polymorphism in general. Also, as you recall from my discussion of interfaces, you probably would also define an interface like IMovable to express that your class can move, rather than relying on the abstract classes and inheritance. (Program to interfaces, not to implementations!)
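As a rough sketch of the interface approach, here is what IMovable might look like. The interface body is my assumption (the name is all we have so far), and I've made Move return a string instead of void purely so the result can be inspected; the Mover class is also hypothetical.

```csharp
using System.Collections.Generic;

// Hypothetical definition; only the name IMovable comes from the text
public interface IMovable
{
	string Move();
}

public class Dog : IMovable
{
	public string Move() { return "walks"; }
}

public class Guppy : IMovable
{
	public string Move() { return "swims"; }
}

// Not an animal at all, but it can still move
public class Electron : IMovable
{
	public string Move() { return "moves somehow"; }
}

public static class Mover
{
	// The caller needs to know only that each item can move
	public static List<string> MoveAll(IEnumerable<IMovable> movers)
	{
		var log = new List<string>();
		foreach (var mover in movers)
		{
			log.Add(mover.Move());
		}
		return log;
	}
}
```

With the interface, Electron no longer needs to share a base class with Dog and Guppy in order to go into the same list.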

O is for O(n) Notation

For day 15 of the Blogging A-to-Z challenge I want to talk about something that computer scientists use but application developers typically don't.

Longtime readers of the Daily Parker know that I put a lot of stock in having a liberal arts education in general, and having one in my profession in particular. I have a disclosed bias against hiring people with computer science (CS) degrees unless they come from universities with rigorous liberal arts core requirements. Distilled down to the essence, I believe that CS majors at most schools spend too much time on how computers work and not enough time on how people work.

But CS majors do have a lexicon that more liberally-educated developers don't really have. For example, when discussing how well code performs, CS majors use "Big O" notation.

In Big O notation, the O stands for "order of growth," meaning how much time the algorithm could grow to take up given worst-case inputs.

The simplest notation is O(1), where the code always takes the same amount of time to execute no matter what the inputs are:

int q = 1;
int r = 2;
var s = q + r;

Adding two integers in .NET always takes the same amount of time, no matter what the integers are.

The titular notation for this post, O(n), means that the execution time grows linearly based on the input. Each item you add to the input increases the time the algorithm takes by exactly one unit, as in a simple for loop:

public long LoopExample(int[] numbers)
{
	long sum = 0; // locals must be assigned before use
	for(var x = 0; x < numbers.Length; x++)
	{
		sum += numbers[x];
	}
	return sum;
}

In that example, each item you add to the array numbers increases the time the loop takes by roughly one unit of time, whatever that unit may be. (On most computers today that unit would be on the order of nanoseconds.)

Other algorithms may be slower or faster than these examples, except that no algorithm can be faster than O(1). The parenthetical can be any expression of n. Some algorithms grow by the logarithm of n, some grow by the square of n, and some grow exponentially (2^n). A constant multiple such as 2n still counts as O(n), because Big O ignores constant factors.

This notation enables people to communicate precisely about how well code performs. If you find that your code takes O(n²) time to run, you may want to find a fix that reduces it to O(log n). And you can communicate to people how you increased efficiency in just that way.
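To make that kind of improvement concrete, here is a small sketch of one problem solved two ways. It takes a quadratic algorithm down to linear rather than logarithmic, and the DuplicateFinder class and its method names are made up for illustration. The linear version assumes HashSet<int> lookups take constant time on average.

```csharp
using System.Collections.Generic;

public static class DuplicateFinder
{
	// O(n²): compares every pair of items
	public static bool HasDuplicateQuadratic(int[] numbers)
	{
		for (var i = 0; i < numbers.Length; i++)
		{
			for (var j = i + 1; j < numbers.Length; j++)
			{
				if (numbers[i] == numbers[j]) return true;
			}
		}
		return false;
	}

	// O(n) on average: one pass, remembering what we've already seen
	public static bool HasDuplicateLinear(int[] numbers)
	{
		var seen = new HashSet<int>();
		foreach (var number in numbers)
		{
			// Add returns false when the value is already in the set
			if (!seen.Add(number)) return true;
		}
		return false;
	}
}
```

Both methods return the same answers; the second trades some extra memory for far fewer comparisons as the array grows.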

So as much as I want application developers to have broad liberal educations, it's worth remembering that computer science fits into a liberal education as well.

N is for Namespace

Day 14 of the Blogging A-to-Z challenge brings us to namespaces.

Simply put, a namespace puts logical scope around a group of types. In .NET and in other languages, types typically belong to namespaces two or three levels down.

Look at the sample code for this series. You'll notice that all of the types have a scope around them something like this:

namespace InnerDrive.Application.Module
{
}

(In some languages it's customary to use the complete domain name of the organization creating the code as part of the namespace and to use alternate letter cases. If I were writing Java, for example, that would look like com.inner_drive.application.module, with an underscore because Java package names can't contain hyphens.)

Every type defined in the namespace belongs to only that namespace. If I defined a type in the example namespace above called Foo, the fully-qualified type name would be InnerDrive.Application.Module.Foo. Because using FQTNs requires a lot of typing and makes code harder to read, C# gives you the using directive, which imports a namespace:

using InnerDrive.Application.Module;

namespace InnerDrive.Application.OtherModule
{
	public class Bar
	{
		public void Initialize() 
		{
			// var foo = new InnerDrive.Application.Module.Foo() is not required
			var foo = new Foo();
			foo.Start();
		}
	}
}

Also, that Bar class belongs only to the InnerDrive.Application.OtherModule namespace, so another developer could create another Bar class in her own namespace without needing to worry about mine.
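Here's a quick sketch of two Bar classes coexisting. The SomeoneElse.Utilities namespace, the Describe methods, and the BarDemo class are all hypothetical; the using alias directives are one way (not the only way) to tell the two apart when a single file needs both.

```csharp
using Mine = InnerDrive.Application.OtherModule;
using Hers = SomeoneElse.Utilities; // hypothetical namespace for the other developer

namespace InnerDrive.Application.OtherModule
{
	public class Bar
	{
		public string Describe() { return "InnerDrive Bar"; }
	}
}

namespace SomeoneElse.Utilities
{
	// Same simple name, different namespace, so no conflict
	public class Bar
	{
		public string Describe() { return "SomeoneElse Bar"; }
	}
}

namespace InnerDrive.Application.Consumer
{
	public static class BarDemo
	{
		public static string DescribeBoth()
		{
			// The aliases stand in for the fully-qualified namespace names
			var mine = new Mine.Bar();
			var hers = new Hers.Bar();
			return mine.Describe() + " / " + hers.Describe();
		}
	}
}
```

Importing both namespaces with plain using directives and then writing new Bar() would not compile, because the compiler couldn't tell which Bar you meant.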

All caught up

Two weeks ago I started writing my A-to-Z posts and got all the way to today's before my life became nuts—as I knew it would—with 4 chorus-related events and a huge increase in my work responsibilities. And with the Apollo After Hours benefit this coming Friday, this weekend will be pretty full as well.

I use my email inbox as a to-do list, and right now it has 35 messages, 30 of which relate to the benefit. I'm very glad the A-to-Z Challenge gives us Sundays off, because I don't know how I'm going to get another week ahead by tomorrow night.

The performances were worth it, though.

M is for Method

Alphabetical order doesn't actually put topics in the best sequence for learning, so we've had to wait until Day 13 of the Blogging A-to-Z challenge to talk about one of the most basic parts of an object-oriented program: methods.

A method takes a message from an object and does something with it. It's the behavior part of the behavior-plus-data pairing that orients your objects in the OO universe.

In .NET, even though you define fields, events, properties, and methods on your classes, under the hood the CLR sees only fields and methods. Properties and events are basically special flavors of methods that C# syntax makes easier to understand for humans. (See Monday's post.)

Take this simple C# snippet:

public string Name { get; internal set; }

The compiled code for that property will look almost the same as the compiled code for this pair of methods with a backing field:

public string get_Name()
{
	return _name;
}

internal void set_Name(string value)
{
	_name = value;
}

private string _name;
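You can check this for yourself with reflection. The sketch below assumes a throwaway Person class (my invention) holding the property from the snippet above, and asks the runtime for the names of the accessor methods the compiler generated.

```csharp
using System.Reflection;

// Hypothetical class that holds the property from the snippet
public class Person
{
	public string Name { get; internal set; }
}

public static class PropertyInspector
{
	public static string[] AccessorNames()
	{
		PropertyInfo property = typeof(Person).GetProperty("Name");

		// Passing true includes non-public accessors, such as the internal setter
		MethodInfo getter = property.GetGetMethod(true);
		MethodInfo setter = property.GetSetMethod(true);

		return new[] { getter.Name, setter.Name };
	}
}
```

AccessorNames returns "get_Name" and "set_Name", the same names as the hand-written method pair.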

In fact, the method pair should look very familiar to Java developers, since that language hasn't really kept up with the times, you know? (Java developers would call the simplified version "syntactic sugar," which is what people call things that make life simpler when their salaries depend on it being complicated. It's essentially every argument a Rails developer has with her .NET counterpart until the first time she needs to decouple the database from the front end. That's when the .NET guy shows her a coding horror from the VB3 era with a mournful warning not to let this happen to her. But I digress.)

To sum up: Methods change the data or behavior of an object, but C# prefers that you use properties to change data, events to express behaviors to external consumers, and methods to ask the object to do something.
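As a small illustration of that division of labor, here is a sketch of a class with one of each. The Room members shown here (Name, Cleaned, Clean) are hypothetical and chosen only to show the three roles side by side.

```csharp
using System;

public class Room
{
	// Property: changes the object's data
	public string Name { get; set; }

	// Event: tells external consumers that something happened
	public event EventHandler Cleaned;

	// Method: asks the object to do something
	public void Clean()
	{
		// Copy to a local so a last-second unsubscribe can't cause a null reference
		var handler = Cleaned;
		if (handler != null) handler(this, EventArgs.Empty);
	}
}
```

A consumer would set room.Name, subscribe with room.Cleaned += ..., and call room.Clean(), never touching the object's internals directly.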

The A-to-Z challenge is off tomorrow, but it will return next week with a basic tool of organizing your software, a basic tool of performance testing, a basic principle of OO design, and three other posts I haven't thought about yet.

L is for LINQ

Day 12 of the Blogging A-to-Z challenge will introduce you to LINQ, another way .NET makes your life easier.

LINQ stands for Language INtegrated Query, which Microsoft describes as follows:

Traditionally, queries against data are expressed as simple strings without type checking at compile time or IntelliSense support. Furthermore, you have to learn a different query language for each type of data source: SQL databases, XML documents, various Web services, and so on. With LINQ, a query is a first-class language construct, just like classes, methods, events.

LINQ does a lot of things, so let me show just a small example. Before LINQ, if you wanted to loop through a collection and filter for specific characteristics, you'd have to do something like this:

public static ICollection<Room> ForEachLooping(IEnumerable<Room> rooms, string filter)
{
	var result = new List<Room>();
	foreach (var item in rooms)
	{
		if (filter == item.Name) result.Add(item);
	}

	return result;
}

Here's the LINQ version; see if you can spot the difference:

public static ICollection<Room> LinqLooping(IEnumerable<Room> rooms, string filter)
{
	return rooms.Where(p => p.Name == filter).ToList();
}

LINQ adds a whole set of extension methods to the IEnumerable<T> interface, including Average, Sum, OrderBy, Join...basically, everything you can do with a SQL statement, you can do with a LINQ statement.

In fact, there's an alternate syntax that's even more SQL-like:

public static ICollection<Room> SqlishLinq(IEnumerable<Room> rooms, string filter)
{
	return
		(from r in rooms
		where r.Name == filter
		select r)
	.ToList();
}

Note that LINQ naturally operates on and returns IEnumerable<T>, not ICollection<T>, so I invoked the .ToList() method for easier testing. In fact you would want to return IEnumerable<T> so that you can easily chain methods that use LINQ, as LINQ doesn't evaluate the whole query chain until you try to use one of its results. Calling ToList() forces an invocation.
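A short sketch makes the deferred evaluation visible. The DeferredDemo class is hypothetical; it counts how many times the Where predicate runs before and after ToList() is called.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
	// Returns { predicate calls before ToList, calls after ToList, matches found }
	public static int[] CountBeforeAndAfter()
	{
		var calls = 0;
		var numbers = new List<int> { 1, 2, 3, 4 };

		// Nothing runs yet: Where only builds the query
		IEnumerable<int> evens = numbers.Where(n =>
		{
			calls++;
			return n % 2 == 0;
		});
		var before = calls;

		// ToList walks the source, running the predicate once per element
		var list = evens.ToList();
		var after = calls;

		return new[] { before, after, list.Count };
	}
}
```

The predicate runs zero times before ToList() and four times after, once per item in the source list, and two items match.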

LINQ is super-powerful and super-handy in too many cases to enumerate* in this short post. But if you use ReSharper (see Tuesday's post), you will learn it super-quickly.

(* See what I did there?)

K is for Key-Value Pairs

The Blogging A-to-Z challenge continues on Day 11 with key-value pairs and simple tuples.

A tuple is a finite ordered list of elements. In mathematics, you usually see them surrounded by parentheses and delimited with commas, like so: (2, 3, 5, 8, 13).

.NET has several generic Tuple classes with 1 through 7 items in the sequence (plus an eight-slot version whose last slot holds another tuple), and a KeyValuePair<TKey, TValue> structure that is roughly the equivalent of Tuple<T1, T2>.

I'm actually not a fan of the Tuple class, though I get why it exists. I prefer naming things what they actually are or do. If you're doing mathematics and need a 3-item tuple, use Tuple<T1, T2, T3>. But if you're doing geography and you need a terrestrial coordinate, create an actual Node<Easting, Northing, Altitude> class and use that instead. (Or just add the Inner Drive Extensible Architecture from NuGet to your project and use mine.)

You probably can't avoid the KeyValuePair<TKey, TValue> structure, however. It's coupled to the Dictionary<TKey, TValue> class, which you will probably use frequently.

Example:

#region Copyright ©2018 Inner Drive Technology

using System.Collections.Generic;

#endregion

namespace InnerDrive.DailyParkerAtoZ.WeekOne
{
	public class KeyValuePairs
	{
		public void Add(string name, Room room)
		{
			_rooms.Add(name, room);
		}

		public Room Find(string name)
		{
			return _rooms.ContainsKey(name) ? _rooms[name] : null;
		}

		public void Remove(string name)
		{
			if (_rooms.ContainsKey(name)) _rooms.Remove(name);
		}
		
		private readonly Dictionary<string, Room> _rooms = new Dictionary<string, Room>();
	}
}

Under the hood, the Dictionary<string, Room> object uses KeyValuePair<string, Room> objects to give you a list of rooms. Note that the key must be unique inside the dictionary; you can't have two rooms called "cupboard under the stairs" or it will throw an exception. Also note the safety features in the code above: the demo class won't throw an exception if you try to find or remove a room that doesn't exist. (There's a philosophical question buried in there: why should or shouldn't it throw?)
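To see both points in action, here is a minimal sketch. It uses a Dictionary<string, int> of room sizes instead of the Room class so it stands alone, and the DictionaryDemo class and its method names are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryDemo
{
	// Enumerating a dictionary hands you one KeyValuePair per entry
	public static List<string> ListEntries(Dictionary<string, int> roomSizes)
	{
		var lines = new List<string>();
		foreach (KeyValuePair<string, int> pair in roomSizes)
		{
			lines.Add(pair.Key + ": " + pair.Value);
		}
		return lines;
	}

	// Returns false because the second Add uses a key that already exists
	public static bool TryAddTwice(Dictionary<string, int> roomSizes, string name)
	{
		roomSizes[name] = 1; // the indexer adds or overwrites without complaint
		try
		{
			roomSizes.Add(name, 2); // Add throws ArgumentException on a duplicate key
			return true;
		}
		catch (ArgumentException)
		{
			return false;
		}
	}
}
```

Note the difference between the indexer and Add: assigning through the indexer silently replaces an existing value, while Add treats a duplicate key as an error.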

As always, download and play with the code samples for more fun and enjoyment.

J is for JetBrains

For day 10 of the Blogging A-to-Z challenge, I'd like to give a shout out to a Czech company that has made my life so much easier over the past five years: JetBrains.

Specifically, their flagship .NET accelerator tool ReSharper makes .NET development so much easier I can't even remember life without it. (If you've downloaded the code samples for this challenge, you may have seen either in the code or in the Git log references to ReSharper, usually when I turned off an inspection for a line or two.)

I'm just going to quote them at length on what the product does:

Code quality analysis

On-the-fly code quality analysis is available in C#, VB.NET, XAML, ASP.NET, JavaScript, TypeScript, CSS, HTML, and XML. ReSharper will let you know if your code can be improved and suggest automatic quick-fixes.

Code editing helpers

Multiple code editing helpers are available, such as extended IntelliSense, hundreds of instant code transformations, auto-importing namespaces, rearranging code and displaying documentation.

Code generation

You don't have to write properties, overloads, implementations, and comparers by hand: use code generation actions to handle boilerplate code faster.

Eliminate errors and code smells

Instant fixes help eliminate errors and code smells. Not only does ReSharper warn you when there are problems in your code but it provides quick-fixes to solve them automatically.

Safely change your code base

Apply solution-wide refactorings or smaller code transformations to safely change your code base. Whether you need to revitalize legacy code or put your project structure in order, you can lean on ReSharper.

Compliance to coding standards

Use code formatting and cleanup to get rid of unused code and ensure compliance to coding standards.

Instantly traverse your entire solution

Navigation features help you instantly traverse your entire solution. You can jump to any file, type, or member in your code base in no time, or navigate from a specific symbol to its usages, base and derived symbols, or implementations.

I can't endorse this product strongly enough. Use ReSharper whenever you're using Visual Studio. It's worth it.