As a consultant for ThoughtWorks I get to see a lot of different codebases. A pattern I've seen in almost all of them is that you can usually tell how well a codebase is written by the tests that cover it.
The first indicator of quality is how covered the code is. In my experience the coverage percentage is generally representative of the confidence the business has in the software. Unfortunately, the indicator only works in one direction: a low coverage percentage suggests a probable lack of quality, but a high coverage percentage does not ensure high quality. If you have less than 80% coverage, you probably have software that is broken in places. However, if you have better than 80% coverage it only means that you have better than 80% coverage. Again, coverage does not ensure quality.
The second indicator of quality is how easily new tests can be added to the existing suite. Working with someone for 10 minutes is usually enough time to see how much effort is required to add new tests (and features). If the developers believe in TDD, but are consistently frustrated with the level of effort it takes to write a test, you probably have a problem. Of course, if the developers do not believe in TDD, raising the barrier to entry with code that is hard to test only compounds the problem.
New tests can also be fairly easy to write, but very hard for new team members to understand. For example, special test superclasses designed to minimize the pain are also warning signs that the system may be unnecessarily complex.
Every project has a ratio of lines of application code to lines of test code. This ratio is affected by the choice of language. The ratio does not need to be 1:1. In fact, language isn't the only factor in determining what the correct ratio is. For example, an internal application for a cafe that is not mission critical should not be as thoroughly tested as the command system for NASA's shuttles. The point is not to set some language-standard ratio, but every project should recognize that if their desired ratio is 1:4 and it takes 8 times longer to write the tests than it takes to write the feature, something is wrong. The time to implement a feature versus the time to write the tests should be roughly equivalent to the application LOC to test LOC ratio.
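The arithmetic behind that rule of thumb can be sketched in a few lines of Python. The function name `effort_overrun` and its parameters are my own labels for illustration, not an established metric: it simply compares the time ratio to the LOC ratio.

```python
def effort_overrun(app_loc, test_loc, feature_hours, test_hours):
    """Compare time spent on tests with what the LOC ratio would predict.

    A result near 1.0 means test time is in line with the app:test LOC
    ratio. A result well above 1.0 means tests cost more than expected.
    """
    expected = test_loc / app_loc          # e.g. a 1:4 ratio gives 4.0
    actual = test_hours / feature_hours    # e.g. 8 times longer gives 8.0
    return actual / expected


# The example from the text: desired ratio 1:4, tests take 8 times longer.
print(effort_overrun(app_loc=1, test_loc=4, feature_hours=1, test_hours=8))  # 2.0
```

In that example the tests take twice as long as the ratio predicts, which is the "something is wrong" signal.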
While you do not need to strive for a 1:1 ratio, the farther you stray from 1:1 the more you should be looking at the reason for doing so. If your language requires a larger ratio, so be it. If your project demands more tests than normal, fair enough. But you should always be on the lookout for your ratio growing unjustifiably. It's always good to have at least one skeptic who believes that 1:1 is what you should be striving for. If you need 8+ lines of setup code to test a one-line method you have a problem, and it pays to have someone who is not afraid to point that out.
Conversely, I've rarely found a codebase that was easy to test that wasn't well written. I believe this is true largely because systems that are easily testable are generally very loosely coupled. Generally, loosely coupled applications are written fairly well to begin with. However, even if a component is poorly written, it will be easy to refactor or rewrite since it can be done independently of the rest of the system.
Along the same lines, when people ask me how to improve a legacy, monster codebase, the first thing I recommend is breaking as many dependencies as possible. Once an application is broken into components it's possible to make educated decisions on what to refactor and what to redo from scratch.
Getting tests around a legacy codebase is always painful; however, it can also give you direction on where the application can be logically broken up. While writing tests for a legacy codebase you should keep track of dependencies that need to be set up, mocked or stubbed but have nothing to do with the current functionality you are focusing on. In general, these are the pieces that should be broken out into components that are easily stubbed (ideally in 1 or 0 lines).
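As a rough illustration of "easily stubbed in 1 line", here is a sketch in Python using the standard library's `unittest.mock`. The `InvoiceCalculator` class and its `tax_service` collaborator are hypothetical names made up for this example. The point is only that once the dependency is passed in as its own component, the test can replace it in a single line.

```python
from unittest.mock import Mock


class InvoiceCalculator:
    """Hypothetical class under test. The tax lookup is a separate
    component handed in through the constructor."""

    def __init__(self, tax_service):
        self.tax_service = tax_service

    def total(self, subtotal, region):
        rate = self.tax_service.rate_for(region)
        return round(subtotal * (1 + rate), 2)


def test_total_includes_tax():
    # One line of setup: a stub whose rate_for() always returns 10%.
    tax_service = Mock(rate_for=Mock(return_value=0.10))
    assert InvoiceCalculator(tax_service).total(100.0, "NY") == 110.0
```

If the tax lookup were buried inside `total` behind a database connection or a global, the same test would need several lines of unrelated setup, which is the kind of dependency worth noting and breaking out.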
Well-written tests generally require loosely coupled, single-responsibility code. The same type of code generally increases the quality of the codebase as a whole. Thus, tests that are readable and maintainable generally indicate a codebase with the same qualities.