Cyclic Structures

 

The most difficult aspect of writing SMILES notations is writing a correct notation for a complicated ring system. Writing SMILES notations for structures containing only one or two rings, however, is fairly simple. The following encoding rules apply to all cyclic structures:

 

(1) Cyclic structures require numbers to indicate where the ring starts and stops. The numbers 1 through 9 are used to indicate the starting and terminating atoms.

 

(2) The SAME number is used to indicate the starting and terminating atoms of each ring.  The starting and terminating atoms must be connected to each other; the shared number stands for the bond that closes the ring.

 

(3) Each number that is used (1, 2, 3, etc.) MUST appear twice and ONLY twice in the entire SMILES notation.  This rule has an exception in the recent MS-Windows versions of the EPI Suite programs:  a SMILES such as  c1ccccc1c1ccccc1  is allowed, and the programs convert it to  c2ccccc2c1ccccc1.

 

(4) Numbers are entered immediately following the atoms used to indicate the starting and terminating positions. For example, a number should not follow a branch as in:  c1ccccc(Br)1; this notation for bromobenzene should be written as c1ccccc1(Br) or c1ccccc1Br.

 

(5) A starting or terminating atom can be associated with two consecutive numbers. For example, naphthalene can be coded as:  c12ccccc1cccc2  (see the example below). The "12" following the first carbon indicates that the first carbon opens both ring 1 and ring 2, so it is connected to both of the later carbons that carry those numbers. Three consecutive numbers are not currently allowed by the EPI Suite programs.
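The rules above can be checked mechanically. The following Python function is a minimal sketch, not part of the EPI Suite programs; the name ring_closure_pairs is hypothetical. It assumes single-character atoms plus Cl, Br and bracket atoms, applies rule (3) strictly (it does not perform the renumbering that recent EPI Suite versions allow), and reports which atoms each ring number joins, counting atoms from 0 in the order written.

```python
def ring_closure_pairs(smiles):
    """Return (number, start_atom, end_atom) for each ring closure.

    Atoms are counted 0, 1, 2, ... in the order they are written.
    Raises ValueError when one of the ring rules (1)-(5) is broken.
    """
    pairs = []        # completed ring closures, in closing order
    open_rings = {}   # ring number -> index of the atom that opened it
    used = set()      # ring numbers that have already been closed
    atom = -1         # index of the most recently read atom
    run = 0           # consecutive ring numbers after the current atom
    prev = ""         # previous character, to spot a number after ")"
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch == "[":
            # Bracket atom such as [Na+]; skip to the closing bracket.
            i = smiles.index("]", i)
            ch = "]"
            atom += 1
            run = 0
        elif ch.isalpha():
            if smiles[i:i + 2] in ("Cl", "Br"):
                i += 1            # two-letter atom symbol
            atom += 1
            run = 0
        elif ch.isdigit():
            d = int(ch)
            if d == 0:
                raise ValueError("rule 1: only the numbers 1-9 are used")
            if prev == ")":
                raise ValueError("rule 4: a number must not follow a branch")
            if atom < 0:
                raise ValueError("rule 4: a number must follow an atom")
            run += 1
            if run > 2:
                raise ValueError("rule 5: three consecutive numbers")
            if d in open_rings:
                pairs.append((d, open_rings.pop(d), atom))
                used.add(d)
            elif d in used:
                raise ValueError("rule 3: number %d appears more than twice" % d)
            else:
                open_rings[d] = atom
        prev = ch
        i += 1
    if open_rings:
        raise ValueError("rule 3: unclosed ring number(s) %s" % sorted(open_rings))
    return pairs


print(ring_closure_pairs("c12ccccc1cccc2"))  # [(1, 0, 5), (2, 0, 9)]
```

For naphthalene the result shows that the first carbon (atom 0) is bonded to atom 5 through ring number 1 and to atom 9 through ring number 2. A notation such as c1ccccc(Br)1 raises a ValueError under rule (4).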

 

Examples are the best way to understand SMILES notations for cyclic structures.  Several examples are illustrated below.  The following approach has been found useful for writing SMILES notations for ring systems:  (a) select one ring from the entire structure and label its starting and terminating atoms with the number 1; (b) begin at the starting atom and "snake your way" (draw a free-hand line) through the cyclic structure so that the "snake" passes every ring member once and finishes at the terminating atom.  Number the starting and terminating atoms of each subsequent ring as the "snake" passes them.

For complicated structures, this can be quite a puzzle with many possible solutions.  The key is to select an appropriate ring to start with.  Once the "snake" has been drawn, write the SMILES notation by starting at the initial atom and following the "snake".  In the examples below, the "snake" is the curved line that starts at the starting atom and ends, at the arrow head, on the terminating atom.  Remember that aromatic atoms are entered in lower case.

 

 

_bm0

 

Benzene SMILES:  c1ccccc1

 

 

_bm1

 

 

SMILES:  C1=CC=CC1
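As a quick check on the two single-ring notations above, the ring size can be read off by counting the atoms from the first occurrence of the ring number to the second. The helper below is a hypothetical sketch (simple_ring_size is not an EPI Suite function) and assumes an unbranched notation with exactly one ring number and no bracket atoms.

```python
import re


def simple_ring_size(smiles):
    """Ring size of an unbranched SMILES that uses one ring number."""
    # Split into atom symbols and ring numbers; bond symbols are ignored.
    tokens = re.findall(r"Cl|Br|[A-Za-z]|[1-9]", smiles)
    atom = -1
    positions = []  # atom index at each occurrence of a ring number
    for tok in tokens:
        if tok.isdigit():
            positions.append(atom)
        else:
            atom += 1
    if len(positions) != 2:
        raise ValueError("expected exactly one pair of ring numbers")
    start, end = positions
    return end - start + 1


print(simple_ring_size("c1ccccc1"))   # 6
print(simple_ring_size("C1=CC=CC1"))  # 5
```

The aromatic notation c1ccccc1 gives a six-membered ring, and C1=CC=CC1 gives a five-membered ring with two double bonds written explicitly.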

 

 

 

[Four example figures are not available in this text version.]

 

 

The following examples illustrate ring systems in which the rings do not share two or more atoms with each other (that is, the rings are not fused):

 

_bm6

 

 

In certain types of ring systems, it is impossible to draw the "snake" completely through all rings.  In these situations, it is necessary to use "ring branching".  The examples of benzene and acenaphthene below demonstrate ring branching; neither of these structures requires it, but it is available.  The strychnine structure example needs it; a SMILES cannot otherwise be written.

 

 

_bm7

 

 

 

_bm8