PAIP (Paradigms of Artificial Intelligence Programming) is one of the best programming books in existence, IMHO. In an article titled "A Retrospective on PAIP", Peter Norvig lists the most important lessons to be learned from his book (with references to the book's relevant pages):
- Use anonymous functions. [p. 20]
 - Create new functions (closures) at run time. [p. 22]
 - Use the most natural notation available to solve a problem. [p. 42]
 - Use the same data for several programs. [p. 43]
 - Be specific. Use abstractions. Be concise. Use the provided tools. Don't be obscure. Be consistent. [p. 49]
 - Use macros (if really necessary). [p. 66]
 - There are 20 or 30 major data types; familiarize yourself with them. [p. 81]
 - Whenever you develop a complex data structure, develop a corresponding consistency checker. [p. 90]
 - To solve a problem, describe it, specify it in algorithmic terms, implement it, test it, debug and analyze it. Expect this to be an iterative process. [p. 110]
 - AI programming is largely exploratory programming; the aim is often to discover more about the problem area. [p. 119]
 - A general problem solver should be able to solve different problems. [p. 132]
 - We must resist the temptation to believe that all thinking follows the computational model. [p. 147]
 - The main object of this book is to cause the reader to say to him or herself "I could have written that". [p. 152]
 - If we left out the prompt, we could write a complete Lisp interpreter using just four symbols. Consider what we would have to do to write a Lisp (or Pascal, or Java) interpreter in Pascal (or Java). [p. 176]
 - Design patterns can be used informally, or can be abstracted into a formal function, macro, or data type (often involving higher-order functions). [p. 177]
 - Use data-driven programming, where pattern/action pairs are stored in a table. [p. 182]
 - Sometimes "more is less": it's easier to produce more output than just the right output. [p. 231]
 - Lisp is not inherently less efficient than other high-level languages - Richard Fateman. [p. 265]
 - First develop a working program. Second, instrument it. Third, replace the slow parts. [p. 265]
 - The expert Lisp programmer eventually develops a good "efficiency model". [p. 268]
 - There are four general techniques for speeding up an algorithm: caching, compiling, delaying computation, and indexing. [p. 269]
 - We can write a compiler as a set of macros. [p. 277]
 - Compilation and memoization can yield 100-fold speed-ups. [p. 307]
 - Low-level efficiency concerns can yield 40-fold speed-ups. [p. 315]
 - For efficiency, use declarations, avoid generic functions, avoid complex argument lists, avoid unnecessary consing, use the right data structure. [p. 316]
 - A language that doesn't affect the way you think about programming is not worth knowing - Alan Perlis. [p. 348]
 - Prolog relies on three important ideas: a uniform data base, logic variables, and automatic backtracking. [p. 349]
 - Prolog is similar to Lisp on the main points. [p. 381]
 - Object orientation = Objects + Classes + Inheritance - Peter Wegner [p. 435]
 - Instead of prohibiting global state (as functional programming does), object-oriented programming breaks up the unruly mass of global state and encapsulates it into small, manageable pieces, or objects. [p. 435]
 - Depending on your definition, CLOS is or is not object-oriented. It doesn't support encapsulation. [p. 454]
 - Prolog may not provide exactly the logic you want [p. 465], nor the efficiency you want [p. 472]. Other representation schemes are possible.
 - Rule-based translation is a powerful idea; however, sometimes you need more efficiency, and need to give up the simplicity of a rule-based system [p. 509].
 - Translating inputs to a canonical form is often a good strategy [p. 510].
 - An "Expert System" goes beyond a simple logic programming system: it provides reasoning with uncertainty, explanations, and flexible flow of control [p. 531].
 - Certainty factors provide a simple way of dealing with uncertainty, but there is general agreement that probabilities provide a more solid foundation [p. 534].
 - The strategy you use to search for a sequence of good moves can be important [p. 615].
 - You can compare two different strategies for a task by running repeated trials of the two [p. 626].
 - It pays to precycle [p. 633].
 - Memoization can turn an inefficient program into an efficient one [p. 662].
 - It is often easier to deal with preferences among competing interpretations of inputs, rather than trying to strictly rule one interpretation in or out [p. 670].
 - Logic programs have a simple way to express grammars [p. 685].
 - Handling quantifiers in natural language can be tricky [p. 696].
 - Handling long-distance dependencies in natural language can be tricky [p. 702].
 - Understanding how a Scheme interpreter works can give you a better appreciation of how Lisp works, and thus make you a better programmer [p. 753].
 - The truly amazing, wonderful thing about call/cc is the ability to return to a continuation point more than once [p. 771].
 - The first Lisp interpreter was a result of a programmer ignoring his boss's advice [p. 777].
 - Abelson and Sussman (1985) is probably the best introduction to computer science ever written [p. 777].
 - The simplest compiler need not be much more complex than an interpreter [p. 784].
 - An extraordinary feature of ANSI Common Lisp is the facility for handling errors [p. 837].
 - If you can understand how to write and when to use once-only, then you truly understand macros [p. 853].
 - A word to the wise: don't get carried away with macros [p. 855].
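
To make the first two lessons in the list concrete (anonymous functions, and closures created at run time), here is a minimal sketch. It is written in Python rather than Lisp purely for illustration, and the name `make_adder` is my own choice, not something quoted from the book.

```python
def make_adder(c):
    """Return a brand-new function that adds c to its argument.

    The returned lambda "closes over" c: it remembers the value of c
    that was in effect when make_adder was called.
    """
    return lambda x: x + c


# Two different closures built at run time from the same code.
add3 = make_adder(3)
add5 = make_adder(5)

print(list(map(add3, [1, 10, 300])))  # [4, 13, 303]
print(add5(10))                       # 15

# An anonymous function used directly, without ever naming it.
print(sorted(["pear", "fig", "banana"], key=lambda s: len(s)))
# ['fig', 'pear', 'banana']
```

The point is that `add3` and `add5` did not exist in the source text; they were manufactured while the program was running.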
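
The data-driven programming lesson (pattern/action pairs stored in a table, p. 182) can be sketched roughly as follows. This is only an illustration in Python: the rule table, the regular expressions, and the `respond` function are all hypothetical and are not taken from the book.

```python
import re

# Hypothetical rule table: each entry pairs a pattern with an action.
# Changing the program's behaviour means editing this data,
# not the control flow in respond().
RULES = [
    (re.compile(r"\b(hello|hi)\b", re.I), lambda m: "How do you do?"),
    (re.compile(r"i want (.+)", re.I), lambda m: f"Why do you want {m.group(1)}?"),
    (re.compile(r"i feel (.+)", re.I), lambda m: f"Do you often feel {m.group(1)}?"),
]


def respond(text, rules=RULES, default="Tell me more."):
    """Run the action of the first rule whose pattern matches text."""
    for pattern, action in rules:
        match = pattern.search(text)
        if match:
            return action(match)
    return default


print(respond("I want a vacation"))  # Why do you want a vacation?
print(respond("I feel tired"))       # Do you often feel tired?
print(respond("The weather is odd")) # Tell me more.
```

Adding a new behaviour is a one-line change to `RULES`; the lookup loop never needs to know what the rules mean.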
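
The lesson about comparing two strategies by running repeated trials (p. 626) is easy to try out on a toy game. The sketch below is an assumption-laden stand-in: it uses a single-pile Nim variant (take 1 to 3 stones, whoever takes the last stone wins) instead of anything from the book, and every function name in it is made up for this example.

```python
import random


def random_strategy(pile, rng):
    """Take a random legal number of stones (1 to 3, at most pile)."""
    return rng.randint(1, min(3, pile))


def mod4_strategy(pile, rng):
    """Try to leave the opponent a multiple of 4; otherwise take 1."""
    take = pile % 4
    return take if take else 1


def play(first, second, pile, rng):
    """Play one game; return 0 if `first` wins, 1 if `second` wins."""
    players = (first, second)
    turn = 0
    while True:
        pile -= players[turn](pile, rng)
        if pile == 0:
            return turn  # this player took the last stone
        turn = 1 - turn


def compare(a, b, trials=1000, pile=21, seed=0):
    """Return [wins for a, wins for b], alternating who moves first."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    wins = [0, 0]
    for i in range(trials):
        if i % 2 == 0:
            winner = play(a, b, pile, rng)
        else:
            winner = 1 - play(b, a, pile, rng)
        wins[winner] += 1
    return wins


result = compare(random_strategy, mod4_strategy)
print("random vs mod4 wins:", result)
```

Alternating the first move matters here, because in this game moving first is itself an advantage and would otherwise bias the comparison.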
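
Caching shows up several times in the list (the four speed-up techniques on p. 269, and memoization on pp. 307 and 662). Here is a minimal sketch of the idea in Python. The `memo` decorator below is hand-rolled for illustration and is not the book's Lisp code; in real Python code the standard library's `functools.lru_cache` does the same job.

```python
import functools


def memo(fn):
    """Wrap fn so that each distinct argument tuple is computed only once."""
    cache = {}

    @functools.wraps(fn)
    def memoized(*args):
        if args not in cache:
            cache[args] = fn(*args)
        return cache[args]

    memoized.cache = cache  # exposed only so the example can inspect it
    return memoized


@memo
def fib(n):
    """Naive recursive Fibonacci; exponential time without the cache."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))         # 832040
print(len(fib.cache))  # 31 entries: one per n from 0 to 30
```

Because the recursive calls inside `fib` go through the memoized name, each value is computed once and the running time drops from exponential to linear in `n`.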