Cyclic Structures
The most difficult aspect of writing SMILES notations is writing a correct notation for a complicated ring system. Writing SMILES notations for structures containing only one or two rings, however, is fairly simple. The following encoding rules apply to all cyclic structures:
(1) Cyclic structures require numbers to indicate where the ring starts and stops. The numbers 1 through 9 are used to indicate the starting and terminating atoms.
(2) The SAME number is used to indicate the starting and terminating atom for each ring. The starting and terminating atom must be connected to each other!
(3) Each number that is used (1, 2, 3, etc.) MUST appear twice and ONLY twice in the entire SMILES notation. The recent MS-Windows versions of the EPI Suite programs make one exception to this rule: a SMILES such as c1ccccc1c1ccccc1 is allowed, and the programs convert it to c2ccccc2c1ccccc1.
(4) Numbers are entered immediately following the atoms used to indicate the starting and terminating positions. For example, a number should not follow a branch as in c1ccccc(Br)1; this notation for bromobenzene should be written as c1ccccc1(Br) or c1ccccc1Br.
(5) A starting or terminating atom can be associated with two consecutive numbers. For example, naphthalene can be coded as c12ccccc1cccc2 (see the example below). The "12" following the first carbon indicates that the first carbon is connected to both of the following numbered carbons. Three consecutive numbers are not currently allowed by the EPI Suite programs.
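Rules (3) through (5) lend themselves to a mechanical check. The following is a minimal Python sketch, not part of the EPI Suite programs: the function name check_ring_numbers is hypothetical, and the code assumes that any digit outside square brackets is a ring number. It applies rule (3) strictly, so it also reports the repeated-number form that the recent EPI Suite versions tolerate.

```python
from collections import Counter

def check_ring_numbers(smiles):
    """Return a list of messages for violations of rules (3), (4) and (5)."""
    problems = []
    counts = Counter()   # how often each ring number occurs
    run = 0              # length of the current run of consecutive ring numbers
    prev = ""            # most recent character that was not a ring number
    in_bracket = False   # digits inside [...] are not ring numbers
    for ch in smiles:
        if in_bracket:
            in_bracket = ch != "]"
            prev, run = ch, 0
            continue
        if ch == "[":
            in_bracket = True
            continue
        if ch.isdigit():
            counts[ch] += 1
            run += 1
            if run == 1 and prev == ")":
                problems.append(f"rule 4: ring number {ch} follows a branch")
            if run == 3:
                problems.append("rule 5: three consecutive ring numbers")
        else:
            prev, run = ch, 0
    for digit, n in sorted(counts.items()):
        if n != 2:
            problems.append(f"rule 3: ring number {digit} appears {n} times")
    return problems

print(check_ring_numbers("c12ccccc1cccc2"))   # naphthalene from rule (5)
print(check_ring_numbers("c1ccccc(Br)1"))     # bromobenzene written against rule (4)
```

An empty list means that none of the three rules was violated. The check says nothing about whether the starting and terminating atoms of a ring can actually be bonded to each other.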
Examples are the best way to understand SMILES notations for cyclic structures, and several are illustrated below. The following approach has been found useful for writing SMILES notations for ring systems: (a) select one ring from the entire structure and label its starting and terminating atoms with the number 1; (b) begin at the starting atom and "snake your way" (draw a free-hand line) through the cyclic structure so that the "snake" passes every ring member once and finishes at the terminating atom; (c) number the starting and terminating atoms of each subsequent ring as the "snake" passes them. For complicated structures, this can be quite a puzzle with many possible solutions; the key is choosing an appropriate ring to start with. Once the "snake" has been drawn, write the SMILES notation by starting at the initial atom and following the "snake". In the examples below, the "snake" is the curved line that runs from the starting atom to the terminating atom and ends at the arrowhead. Remember that aromatic atoms are entered in lower case.
Benzene SMILES: c1ccccc1
SMILES: C1=CC=CC1
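The "snake" procedure can also be sketched in code. The Python example below is an illustration only, and the function name snake_to_smiles and its two arguments are hypothetical. It assumes that the atoms are listed in the order the "snake" passes them (with any bond symbol written in front of the atom), and that each ring is given as a pair of positions for its starting and terminating atoms.

```python
def snake_to_smiles(atoms, closures):
    """Write a SMILES string for atoms listed along the "snake".

    atoms    -- atom symbols in snake order, e.g. ["C", "=C", "C"]
    closures -- (start, end) positions of each ring's starting and
                terminating atoms, counted from 0
    """
    if len(closures) > 9:
        raise ValueError("only the ring numbers 1 through 9 are available")
    labels = [""] * len(atoms)
    # Number the rings in the order the snake reaches their starting atoms.
    for number, (start, end) in enumerate(sorted(closures), start=1):
        labels[start] += str(number)
        labels[end] += str(number)
    return "".join(atom + label for atom, label in zip(atoms, labels))

print(snake_to_smiles(["c"] * 6, [(0, 5)]))                    # c1ccccc1
print(snake_to_smiles(["C", "=C", "C", "=C", "C"], [(0, 4)]))  # C1=CC=CC1
print(snake_to_smiles(["c"] * 10, [(0, 5), (0, 9)]))           # c12ccccc1cccc2
```

The sketch does not check that the chosen starting and terminating atoms are really bonded in the structure, nor that an atom carries at most two numbers; both remain the writer's responsibility.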
The following examples illustrate ring systems where the rings are not connected to each other at two or more atoms (not fused):
In certain types of ring systems, it is impossible to draw the "snake" completely through all rings. In these situations, it is necessary to use "ring branching". The examples of benzene and acenaphthene below demonstrate ring branching; neither of these structures requires it, but it is available. The strychnine structure example needs it; a SMILES cannot otherwise be written.
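As a rough illustration, the Python sketch below lists the bonds implied by a simple SMILES string. It assumes that "ring branching" means writing part of a ring inside branch parentheses. The notation c1cc(ccc1) used here is a hypothetical way of writing benzene with a ring branch and is not taken from the original examples. The function name smiles_bonds is likewise hypothetical, and the parser handles only one-letter atoms, parentheses, and the ring numbers 1 through 9.

```python
def smiles_bonds(smiles):
    """Return the set of bonded atom pairs (by position) in a simple SMILES."""
    bonds = set()
    open_rings = {}     # ring number -> position of the starting atom
    branch_stack = []   # atoms to return to when a branch closes
    prev = None         # atom that the next atom will be bonded to
    count = 0           # number of atoms read so far
    for ch in smiles:
        if ch.isalpha():
            if prev is not None:
                bonds.add(frozenset((prev, count)))
            prev = count
            count += 1
        elif ch.isdigit():
            if ch in open_rings:
                # Second occurrence: bond the terminating atom to the starting atom.
                bonds.add(frozenset((open_rings.pop(ch), prev)))
            else:
                open_rings[ch] = prev
        elif ch == "(":
            branch_stack.append(prev)
        elif ch == ")":
            prev = branch_stack.pop()
        # Bond symbols such as "=" are ignored in this sketch.
    return bonds

plain = smiles_bonds("c1ccccc1")
branched = smiles_bonds("c1cc(ccc1)")
print(plain == branched)   # both notations describe the same six ring bonds
```

The two sets compare equal here only because the atoms are read in the same order in both notations. In general, two notations for the same structure can number the atoms differently, and comparing them requires more than a set comparison.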