OK, I've had a chance to look at Smith's Modern Optical Engineering.

This is what it says [p. 23]:

The effective focal length (efl) of a system is the distance from the principal point to the focal point.

If the rays entering the system and those emerging from the system are extended until they intersect, the points of intersection will define a surface, usually referred to as the principal plane. /.../ The intersection of this surface with the axis is the principal point.

As for calculation of the (effective) focal length:

The book defines "effective" focal length (efl) as mathematically equal to "focal length" [p. 39]:

$\mathrm{efl} = f = -y_1/u'_2$, where:

$y_1$ is the height of an incident ray parallel to the optical axis,

$u'_2$ is the angle (in radians) of the same ray after passing through the optical system.

If $n$ and $n'$ are the indices of refraction before and after an optical surface, $u$ and $u'$ are the beam angles relative to the optical axis before and after the surface, and $R$ is the radius of curvature of the surface, then

$n'u' = nu - y(n'-n)/R$
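To see how these two relations fit together, here is a minimal sketch of a paraxial ray trace through a lens with two surfaces. It is my own illustration, not code from the book. The function name `paraxial_efl`, the tuple layout for a surface, and the example lens data are all made up for the example. The transfer step between surfaces (new height = old height + thickness × angle) is not among the formulas quoted above; I am assuming the usual paraxial form.

```python
def paraxial_efl(surfaces, y1=1.0):
    """Trace a paraxial ray entering parallel to the axis and return the efl.

    surfaces: list of (R, n_before, n_after, t_to_next) tuples.
    This data layout is a hypothetical choice made for this sketch.
    """
    y, u = y1, 0.0  # ray starts at height y1, parallel to the axis
    for R, n, n_prime, t in surfaces:
        # Refraction at the surface, as quoted: n'u' = nu - y(n' - n)/R
        u = (n * u - y * (n_prime - n) / R) / n_prime
        # Transfer to the next surface (assumed, not quoted): y_next = y + t*u'
        y = y + t * u
    # efl = -y1 / u', with u' the ray angle after the last surface
    return -y1 / u


# Hypothetical biconvex lens: R1 = 50, R2 = -50, index 1.5, thickness 5, in air
lens = [(50.0, 1.0, 1.5, 5.0), (-50.0, 1.5, 1.0, 0.0)]
efl = paraxial_efl(lens)
print(efl)
```

For this made-up lens the result comes out at about 50.85 (in the same length unit as the radii), which agrees with the thick-lens lensmaker's formula. Setting the thickness to zero gives exactly the thin-lens value of 50.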

These definitions assume paraxial rays, so the term "focal length" by definition only applies to paraxial rays, I guess?

When it comes to "effective focal length" vs "focal length", perhaps someone thought that "focal length" sounded too boring and felt the need to prepend "effective"?

The book also gives an excellent justification for using the paraxial approximation [p. 32] (my emphasis):

The paraxial region of an optical system is a thin threadlike region about the optical axis which is so small that all the angles made by the rays (i.e., the slope angles and the angles of incidence and refraction) may be set equal to their sines and tangents. At first glance this concept seems utterly useless, since the region is obviously infinitesimal and seemingly of value only as a limiting case. However, calculations of the performance of an optical system based on paraxial relationships are of tremendous utility. Their simplicity makes calculation and manipulation quick and easy. **Since most optical systems of practical value form good images, it is apparent that most of the light rays originating at an object point must pass at least reasonably close to the paraxial image point.**
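To get a feel for what "may be set equal to their sines and tangents" costs in practice, here is a small sketch of my own (not from the book) that computes the relative error of replacing sin(u) and tan(u) with u itself. The function name `paraxial_errors` and the sample angles are arbitrary choices for illustration.

```python
import math


def paraxial_errors(angle_rad):
    """Relative error made by replacing sin(u) and tan(u) with u itself."""
    sin_err = abs(angle_rad - math.sin(angle_rad)) / math.sin(angle_rad)
    tan_err = abs(angle_rad - math.tan(angle_rad)) / math.tan(angle_rad)
    return sin_err, tan_err


# Sample angles chosen arbitrarily, from nearly paraxial to clearly not
for degrees in (0.1, 1.0, 5.0, 20.0):
    u = math.radians(degrees)
    sin_err, tan_err = paraxial_errors(u)
    print(f"{degrees:5.1f} deg: sin error {sin_err:.2e}, tan error {tan_err:.2e}")
```

The errors grow roughly with the square of the angle, so they are negligible at a fraction of a degree and reach a few percent by 20 degrees.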