Cheminfo Retrieval Classes 1 and 2 in 2010

My first Chemical Information Retrieval class for the Fall of 2010 took place on September 23, 2010. This is the second time that I've taught the class as sole instructor, and it was certainly convenient to have last year's wiki to build upon. The assignments are the same, so it was helpful to be able to point students to last year's work as examples.

The key message from my introductory lecture was that it can be really difficult to find usable chemical information and that there are no shortcuts like relying on a single trusted source - those don't exist. I showed a few examples of emerging models - Open Access, Open Notebook Science, Collaborative Competition (like pharma companies sharing some drug data openly) and other Open Science initiatives.
I also announced that we would be doing something new in the Science3.0 theme (the semantic web). One of the assignments involves collecting 5 values from the literature for each of 5 properties for a compound of the student's choice. In addition to adding these values on the wiki, we will collect them in a format that is friendly to machines: a ChemInfo Validation Google Spreadsheet. Andrew Lang has agreed to help with adapting our previous code for solubility to creating web services for this application. For example, we can have a service that reports the mean and standard deviation for a particular property and chemical. Another could produce statistics for a given data source or compare peer reviewed vs non peer reviewed sources, etc. Since it will be possible to call these web services from within a Google Spreadsheet or Excel, it should enable much more sophisticated analysis of the data related to the "validity" of chemical information as it exists today.
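To make this concrete, here is a minimal Python sketch of what such a mean-and-standard-deviation service might compute. It assumes, hypothetically, that the validation spreadsheet is published as CSV with chemical, property, value and source columns; the URL and column names are placeholders rather than the actual service.

```python
import csv
import statistics
import urllib.request

CSV_URL = "https://example.com/cheminfo-validation.csv"  # hypothetical published-CSV URL

def property_stats(chemical, prop):
    """Mean, standard deviation and count of all reported values
    for one property of one chemical."""
    with urllib.request.urlopen(CSV_URL) as resp:
        rows = csv.DictReader(resp.read().decode("utf-8").splitlines())
        values = [float(r["value"]) for r in rows
                  if r["chemical"] == chemical and r["property"] == prop]
    return statistics.mean(values), statistics.stdev(values), len(values)

print(property_stats("benzene", "boiling point"))
```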
I didn't record the first lecture but I have the slides below:

During the second lecture on September 30, 2010 I spent most of the time showing students how to use Beilstein Crossfire, SciFinder and ChemSpider to find values for chemical properties. The recording for the second lecture is available below:

IGERT NSF panel on Digital Science

On May 24, 2010 I was part of a panel in Washington for the NSF IGERT annual meeting. As I mentioned previously, it is encouraging to find that funding agencies are paying more attention to the role of new forms of scholarship and dissemination of scientific information.

My co-panelists included Janet Stemwedel, who talked about the role of blogging in an academic career, Moshe Pritzker, who made a case for using video to communicate protocols in life sciences and Chris Impey, who demonstrated applications of clickers and Second Life in the classroom.

We only had 10 minutes each to speak so the presentations were basically highlights of what is possible. Still, it was enough to stimulate a vigorous discussion with the audience. There was a bit of controversy about the examples I used to demonstrate the limitations of peer review in chemistry. People can misinterpret what we are trying to do with ONS - it certainly doesn't include bringing down the peer review system (not that we could anyway). But we have to face the situation that peer review does not validate all the data and statements in a paper. It operates at a much higher level of abstraction. Providing transparency to the raw data should work in a synergistic way with the existing system.

My favorite part of the conference was easily Seth Shulman's talk on the "Telephone Gambit". Ever since reading his book, I have been using the story of how carefully reading Bell's lab notebook has forced us to revise the generally accepted notion of how the telephone was invented. Seth's presentation was truly captivating because he explained not only what was done but also what motives were at work to deceive and obfuscate. This cautionary tale is still very much relevant to science and invention today - and highlights how transparency can help guard against this type of outcome.

Reaction Attempts Explorer

Two months ago I reported on the Reaction Attempts project and the availability of the summary as a physical or electronic (PDF) book. The basic idea behind the project is to collect organic chemistry reaction attempts reported in Open Notebooks. This would include not only successful experiments but also those which could be categorized as failed, ambiguous, in progress, etc.

The book was organized with reactants listed alphabetically. In this way one could browse through summaries of the types of reactions being attempted by different researchers on a reactant of interest. There might be information there (what to do or what to avoid) of some use for a planned reaction. At the very least one could contact the researcher to initiate a discussion about work that had not yet been published in the traditional system.

Andrew Lang has just created a web-based tool to explore the Reaction Attempts database in much more sophisticated ways.

Here are some scenarios of how one could use it. On the left hand side of the page is a dropdown menu containing an alphabetically sorted list of all the reactants and products in the database. Let's select furfurylamine.


This immediately informs us that there are 230 reactions involving furfurylamine and it lists the schemes for all these reactions upon scrolling down. That's still a bit hard to process so a second dropdown menu appears populated with a list of other reactants or products involved with furfurylamine.

We now select boc-glycine and that narrows our search to 145 reactions.

Selecting benzaldehyde from the third dropdown menu narrows the search further to 61 reactions.

The final dropdown menu contains a short list of only isocyanides and thus all represent attempted Ugi reactions. Selecting t-butyl isocyanide gives us 56 reactions.

That means that these same 4 components were reacted together 56 times. Looking at the various reaction summaries will show that some of these are duplicates for reproducibility, while others vary the concentration and solvent, with the effect on yield included. This particular reaction was in fact the subject of a paper on the optimization of a Ugi reaction using an automated liquid handler.

Now here is where the design of the Explorer comes in handy. We might want to ask if the reaction proceeds as well with the other isocyanides. All we have to do is switch the final dropdown menu to ask what happens when we go from t-butyl to n-butyl isonitrile. There is a single attempt of this reaction and it is "failed" in the sense that no precipitate was obtained from the reaction mixture. This doesn't mean that the reaction didn't take place - it might be that the Ugi product was too soluble. We can quickly verify that the concentration and solvent are in line with conditions that allowed precipitation of the t-butyl derivative.

OK, let's see what happens with n-pentyl isocyanide.

It looks like it behaves just like n-butyl isocyanide: another single non-precipitation event. What about benzyl isocyanide?

This time we do get the Ugi product from a single attempt. Note the lower yield compared to the t-butyl isocyanide under similar conditions.

What about with cyclohexyl isocyanide?

This time we hit an experiment in progress. A precipitate was obtained but it was not characterized. We can click on the link to the lab notebook page (EXP232) to learn more about how long it took for the precipitate to appear, but there are not enough data to draw a definite conclusion about the success of the reaction. However, based on the results from the other precipitates in this series it is probably encouraging enough to repeat and characterize the product.

There are other sources of information here. Clicking on the image of the Ugi product takes us to its ChemSpider entry. In this case the only associated data relates to this reaction attempt.

Let's look at another scenario: reactions involving aminoacetaldehyde dimethyl acetal.

In this case we find the intersection of two Open Notebooks. The first reaction comes from Michael Wolfle from the Todd group.

The second comes from Khalid Mirza from the Bradley group.

In order to learn more about the nature of the overlap we can use the substructure search capabilities of the Reaction Explorer. Simply click on the image of the acetal and the ChemSpider entry pops up. Now click on the copy button next to the SMILES for the compound.

Paste the SMILES into the SMARTS box of the Reaction Explorer.
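This works because a SMILES string can generally be reused directly as a SMARTS query. As a rough illustration of the underlying idea - not the Explorer's actual implementation - here is a short RDKit sketch:

```python
from rdkit import Chem

# SMILES of aminoacetaldehyde dimethyl acetal, reused directly as a SMARTS query
query = Chem.MolFromSmarts("NCC(OC)OC")

# Illustrative candidates; the Explorer would scan every reactant and product
for smi in ["NCC(OC)OC", "O=Cc1ccccc1"]:
    mol = Chem.MolFromSmiles(smi)
    print(smi, mol.HasSubstructMatch(query))
```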

We get 13 reaction attempts for this query - the two we found earlier and the rest corresponding to attempts by Michael Wolfle to synthesize praziquanamine.

We learn that one connection between these two notebooks involves different attempts at synthesizing praziquantel.

Hopefully this demonstrates the value of abstracting organic chemistry reaction attempts from Open Notebooks into a machine readable format. Contributions to the database require only the ChemSpider IDs of the reactants and product and a link to the relevant lab notebook page. Reaction schemes are automatically generated by the system. More on the Reaction Attempts project here.

ChemTaverna Workflows of ONS Web Services now on MyExperiment

I'm pleased to report that one of the collaborations initiated at the Berkeley Open Science conference last month is progressing very well.

Carole Goble introduced me to Peter Li who runs the ChemTaverna project. The idea was to use Taverna to construct workflows using the web services developed by Andrew Lang for our Open Notebook Science projects: UsefulChem and the ONS Solubility Challenge.
Peter quickly created several workflows to demonstrate what is possible. Here is a workflow that uses a Google Spreadsheet as input. SMILES for amines, carboxylic acids, aldehydes and isonitriles are entered in the appropriate columns. The workflow first creates a virtual library of Ugi products from all possible combinations of reactants. Then each product is submitted to a web service that predicts the solubility in methanol, the most common solvent for Ugi reactions.
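To give a sense of the enumeration step, here is a minimal Python sketch. The reactant SMILES are illustrative, and assembling the actual Ugi product structure and calling the solubility service are left as comments since those details belong to the workflow itself.

```python
from itertools import product

# Illustrative reactant lists; in the workflow these come from spreadsheet columns
amines = ["NCc1ccco1"]                  # furfurylamine
acids = ["OC(=O)CNC(=O)OC(C)(C)C"]      # boc-glycine
aldehydes = ["O=Cc1ccccc1"]             # benzaldehyde
isocyanides = ["[C-]#[N+]C(C)(C)C"]     # t-butyl isocyanide

# Every combination of one reactant from each class defines one virtual Ugi product
for amine, acid, aldehyde, isocyanide in product(amines, acids, aldehydes, isocyanides):
    # here the workflow would assemble the product SMILES and submit it to the
    # methanol solubility web service
    print(amine, acid, aldehyde, isocyanide)
```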

The resulting spreadsheet can then be sorted by predicted solubility to recommend products that are more likely to precipitate from the reaction mixture. In this particular example Ugi products derived from boc-glycine are predicted to have a low solubility in methanol. The least soluble compound is predicted to have a solubility of only 0.07 M. In this library, Ugi products derived from boc-methionine are predicted to be too soluble to precipitate. For example this Ugi product has a predicted solubility of 3.7 M.
(note: ChemSpider has a tendency to draw the minor tautomer for some amides and carbamates)

There are a few issues to take into consideration in order to use this particular workflow:
1) This will only work on Taverna Workbench 2.1.2 with these plug-ins installed. At some point it will be made to work on Taverna Workbench 2.2 and uploaded onto MyExperiment. The workflow used here is currently available here.
2) The SMILES in the input Google Spreadsheet must be written in the format of the current example (aldehyde, amine and isonitrile groups on the left and carboxylic acid groups on the right)
3) All of the Ugi products in the virtual library must already exist in ChemSpider. Otherwise, the solubility predictions will fail because of missing descriptors as discussed previously.
Peter has uploaded simpler workflows onto MyExperiment that are compatible with the current version of Taverna Workbench (v2.2).
First, the generation of Ugi product libraries from reactant SMILES in a Google Spreadsheet is available here.

Another workflow handles the prediction of Abraham descriptors.
This workflow processes the prediction of solubility for a given solute and solvent.

The main rationale for incorporating web services derived from our Open Notebook Science projects into Taverna is leverage. MyExperiment already benefits from a vigorous community of developers in the bioinformatics arena. With the growth of the ChemTaverna initiative, the integration of cheminformatics and bioinformatics workflows should become seamless.

By making our solubility and chemical reaction web services available in formats that are convenient for others to use, we increase the chances that our work will actually be useful. It also makes it easier for us to leverage the resources made available by others for our own applications in drug discovery and reaction design.

Essentially, this means that we have extended the reach of the information cascade that begins with the recording of an experiment in a laboratory notebook, followed by a very simple abstraction process that represents the experiment in a semantically addressable format.

ASMS: Anthrax attacks

Ever since the infamous US anthrax attacks of 2001, where envelopes containing anthrax spores were mailed to a number of media outlets and two US Senators, there has been a push to develop new ways of determining the severity of anthrax infections.

John Barr, of the US Centers for Disease Control and Prevention (CDC), has developed a new, more sensitive way of monitoring the level of infection in a victim. This is keenly important as the symptoms for anthrax infection start off looking much like a cold or the flu, but can then lead to a subject deteriorating rapidly – often leading to death, even after treatment. According to Barr some 40 per cent of the victims of the 2001 anthrax letters died.

The Bacillus anthracis bacterium produces two different toxins, the oedema factor and the lethal factor. Barr has developed a way of detecting both of these using a liquid chromatography – mass spectrometry (LC-MS) approach that can provide earlier diagnosis than any other technique. This is particularly important as providing antibiotics at an early stage in the infection can increase the odds of survival.

His method, which uses an antibody purification step to extract the toxins, can detect the toxins at concentrations as low as 25 pg/ml in about two hours. If the antibody extraction step is left for around 16 hours, that detection limit can fall as low as 5 pg/ml.

The progression of the infection tends to go through a brief remission, and the changes in lethal factor levels correlate with the clinical symptoms. Other methods that rely on detecting the bacteria themselves often fail during this remission stage.

Barr believes his results should enable clinicians to predict the clinical outcome of an infection, which could prove immensely important as there have recently been a number of anthrax poisoning cases in Scotland, after heroin addicts injected themselves with heroin contaminated with anthrax spores.

Matt Wilkinson

This week on Chemistry World…

1 June 2010: Have something to say about an article you’ve read on Chemistry World this week? Leave your comments below…

This week’s stories…

Basic research bill backed in US
US bill that boosts science funding passes on third attempt after Democrats employ unusual procedural tactic

Universities face hard years ahead
Funding cuts to universities across Europe as a result of the economic crisis will impact teaching and research quality for years to come, says report

Structural order gained over conducting polymer
Researchers have used copper as both catalyst and template to gain structural control over an important conducting polymer

Liquid marbles detect gases
Scientists use porous properties of liquid marbles to develop gas sensors

Instant insight: Cosmic dust as chemical factories
Daren Caruana and Katherine Holt discuss how electrochemistry could be the missing link to understanding chemistry in space

Use of ONS to protect Open Research: the case of the Ugi approach to Praziquantel

As we were collecting reactions from The Synaptic Leap for the Reaction Attempts project, Andrew Lang noticed that there might be a quick synthetic route to praziquantel via a Ugi reaction. I researched it further and found a paper (Kim et al 1998) where Ugi product 1 was indeed converted to racemic praziquantel via the Pictet-Spengler cyclization.


Using Beilstein Crossfire, the only synthesis of 1 I found involves a multi-step amidation strategy. But this compound should be accessible in one step from commercially available starting materials via a Ugi reaction (shown above). Since all the starting materials are liquids we have some flexibility with solvent choice. Khalid first tried it in methanol (EXP258) a few weeks ago but did not get a precipitate. He was going to monitor it by NMR next to see if the problem was high solubility of the Ugi product or with the reaction itself.

It was therefore with great interest that I read Mat Todd's report this morning on The Synaptic Leap that a German patent had been issued on this Ugi strategy to praziquantel. (TSL didn't provide a means of leaving a comment so I edited the page - which made me the author of that post but actually Mat wrote it)

I have often mentioned during my talks that Open Notebook Science could be used not only in a defensive manner to claim academic priority - but also as an offensive tactic to block patent applications. A company attempting to prevent the commercial exploitation of rival inventions has a few options. Where applicable, it can buy up an existing patent pool with the intention of sitting on it. For new inventions, it can do research and try to file patents before its competitors. But this is a costly process and it may make more sense to simply publish the inventions to create disclosed prior art, thereby blocking the patent applications of its competitors.

But - as I and many others have discussed - the current publication system is not optimally suited for the purpose of simply disclosing and communicating science. Not only is it generally slow but the traditional article format requires a narrative of some sort - rarely can single experiments be published. This means that much (if not most) of the research done by an individual or group will never be disclosed.

For these reasons I think that keeping an easily discoverable Open Notebook for projects designed to block patent submission by competitors makes a lot of sense - both economically and from a workflow perspective. Since researchers already have to keep a lab notebook, making it public doesn't impose the added time that writing an article or patent will require.

In this specific example of praziquantel we were too late. But if we had recorded this experiment a few years ago it might have worked to block Domling's patent. Now, it isn't clear to me that EXP258 would have been enough to do that. The strategy to make praziquantel via a Ugi reaction was clearly stated but the experiment was not conclusive. However, since Domling reported that methanol worked I am sure that we would have had the "reduced to practice" evidence in the notebook shortly.

Above I used a company as an example of a party motivated to disclose inventions to protect their interests. In our case it would not be a company but rather the entire Open Science community. It is in our best interest to keep our scientific territory as unencumbered by patents as possible. Keeping Open Notebooks might be one of the simplest means of ensuring that.

Consider a humanitarian organization that might want to manufacture praziquantel. I haven't researched it but presumably the Domling patent was filed in a number of countries beside Germany. In order to consider using the Ugi strategy, the organization would now have to deal with the patent holder. This might be the factor that makes this route untenable. Patents have proven to be problematic for humanitarian aid - even in the simple case of providing food.

But all is not lost. In addition to offering a simple 2-step synthesis of praziquantel, the Ugi route offers an easy way to make large libraries of analogs. Optimally we would like to work with someone who has experience with docking praziquantel. It might be interesting to screen not only the praziquantel analogs but also the uncyclized Ugi products themselves. When we did this for malarial enoyl reductase inhibitors (D-EXP005) we found that we did not need to cyclize to obtain compounds predicted to bind. This ultimately led to active compounds.

The Reaction Attempts Solvent Selector

The ONS Solubility Challenge and the Reaction Attempts project have now been integrated with code written by Andrew Lang to the point that recommendations for solvents are just a click away.

First use the Reaction Attempts Explorer either using the drop-down menus or substructure search as described previously. When a reaction of interest is identified just click on the link for "Optimal Solvent Prediction".
The service will then provide a summary of solubility measurements and predictions, organized by the default criteria of minimum 0.3 M solubility of reactants, maximum 0.03 M solubility of the product and maximum solvent boiling point of 100 C. Liquid reactants (or reactants with melting points within 15 C of room temperature) are excluded since these generally have a high enough solubility in most solvents.
In the case of the Ugi reaction in this example, only the solubility of boc-glycine and the product are considered.

The results are color-coded. In this case 14 solvents are coded green, indicating that all criteria were met. The fifteenth solvent is coded yellow, indicating that one of the criteria was not met - in this case the boiling point of 205 C is outside of the limit of 100 C. High boiling point solvents are not optimal for quickly obtaining the product as a dry solid after filtering. This criterion can be changed in the input fields at the top of the page. It is also possible to change the number of times the product is washed there. This will only change the estimated yield, which is based on carrying out the reaction at the concentration of the least soluble reactant, up to 1 M.
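A minimal Python sketch of this filtering logic follows; the solvent numbers are placeholders, and the handling of more than one missed criterion is my assumption rather than documented behavior of the tool.

```python
MIN_REACTANT_SOL = 0.3   # M, minimum solubility for each solid reactant
MAX_PRODUCT_SOL = 0.03   # M, maximum solubility for the product
MAX_BP = 100.0           # C, maximum solvent boiling point

solvents = [  # illustrative values only
    {"name": "ethanol", "bp": 78.4, "reactant_sol": 0.45, "product_sol": 0.012},
    {"name": "benzonitrile", "bp": 191.0, "reactant_sol": 0.40, "product_sol": 0.020},
]

for s in solvents:
    missed = [s["reactant_sol"] < MIN_REACTANT_SOL,
              s["product_sol"] > MAX_PRODUCT_SOL,
              s["bp"] > MAX_BP].count(True)
    color = "green" if missed == 0 else "yellow" if missed == 1 else "flagged"
    print(s["name"], color)
```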

Three columns are generated for the product and each reactant. The column on the right is the average of all measurements, as recorded in the SolubilitiesSum Spreadsheet. The middle column is a solubility prediction based on Abraham descriptors derived from experimental values, as described and used in the ONS Solubility Challenge book. The column on the left contains predictions from the Abraham001 model, which is based on calculated molecular descriptors only.
The numbers in bold represent the best solubility value available for each solvent. If a measurement is known, that will be the number used. If no measurement is available, the experimental Abraham descriptor model is used. If neither of these are available the predictions from the Abraham001 model are used by default.
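In other words, the bold value follows a simple fallback chain, which could be sketched as:

```python
def best_value(measured, abraham_measured, abraham001):
    """Return the value shown in bold: a measurement if one exists, otherwise the
    experimental-descriptor Abraham prediction, otherwise the Abraham001 prediction."""
    for value in (measured, abraham_measured, abraham001):
        if value is not None:
            return value
    return None
```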
From the list of solvents in the green section we find ethanol and acetonitrile. Both of these solvents were tried (as mixtures with methanol) in the optimization of this reaction (Bradley et al JoVE 2008) and provided good to intermediate results. THF was found to give low yields for this reaction and it scores at #51 in the yellow section, with a high solubility of the product accounting for the missed criterion.
One should keep in mind that this is just a tool to flag potentially interesting solvents. Common chemical sense needs to be used as well. For example, acetone and butanone are listed in the green section but these are incompatible with the Ugi reaction since they would compete with the aldehyde.
Note that the predictive models are way off in some cases. For example the Abraham001 model dramatically underestimates the solubilities of boc-glycine in the green section, while the measured Abraham descriptor model does much better for these cases. We will prioritize our next solubility measurements to try to improve the models - or at least understand what types of compounds are most likely to yield useful solubility estimates from these models.
In addition to being called from the Reaction Attempts Explorer, the Solvent Selector can be used for any compounds that have ChemSpider IDs. Simply separate the CSIDs with the pipe character:
After modifying the criteria and hitting update, the new criteria are conveniently represented in the URL in this format, making sharing a specific search with anyone easy:
It is even possible to use the service listing just one compound's CSID - this is useful for quickly comparing the measured solubilities with predictions from both models:

Green Solvent Metric on Solvent Predictor

In the spirit of contributing to Peter Murray-Rust's initiative to collect Green Chemistry information, Andrew Lang and I have added a green solvent metric for 28 of the 72 solvents we include in our Solvent Selector service. The scale represents the combined contributions for Safety, Health and Environment (SHE) as calculated by ETH Zurich.

For example consider the following Ugi reaction solvent selection. Using the default thresholds, 6 solvents are proposed and 5 have SHE values. Assuming there are no additional selection factors, a chemist might start with ethyl acetate with a SHE value of 2.9 rather than acetonitrile with a value of 4.6.

Individual values of Safety, Health and Environment for each solvent are available from the ETH tool. We are just including the sum of the three out of convenience.

Note that the license for using the data from this tool requires citing this reference:
Koller, G., U. Fischer, and K. Hungerbühler, (2000). Assessing Safety, Health and Environmental Impact during Early Process Development. Industrial & Engineering Chemistry Research 39: 960-972.

Resveratrol Thesis on Reaction Attempts

A few days ago Andrew Lang suggested to Dustin Sprouse that he submit his thesis to the Reaction Attempts database. Like many undergraduates, Dustin put a lot of time and effort into doing experiments and writing up his results but didn't have quite enough time to obtain all that would have been required for a traditional publication.

A thesis is an unusual document within the context of scientific communication. Unlike a peer reviewed paper, it may contain a large number of "failed experiments" and a substantial amount of speculation. Although it is not quite as detailed as a lab notebook, there is often plenty of raw data and details about how failed or ambiguous experiments proceeded.
In Dustin's case we felt that there was enough information provided to include his thesis in Reaction Attempts. In addition, his thesis was accepted by Nature Precedings, thus providing a convenient means of citation.
The first component of the Reaction Attempts project is to quickly abstract the most basic information from synthetic organic chemistry reactions. This includes the ChemSpider IDs and SMILES of the reactants and target products and brief notes about conditions and outcomes. We are especially interested in failed or ambiguous experiments because these have almost no chance of being communicated and indexed in the traditional systems. When attempting to carry out a reaction, it can be just as useful to know what doesn't work - and more specifically how it doesn't work.
The second component of the project is dissemination. Because the information is encoded semantically, it can be automatically converted to both human and machine readable formats.
One human interface consists of a PDF book (also as a hard copy), with the option of selected reactions specified by listing CSIDs of reactants in the URL. For example Dustin's reactions can be presented selectively here. We also have a Reaction Explorer, where reactants or products can be selected from a dropdown menu or via a substructure search.
We also provide live XML feeds so that others can create applications easily from machine readable data. For example one could create reaction chains automatically, which will occur whenever we enter reactions from multi-step syntheses like Dustin's - based on the synthesis of resveratrol.
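As a rough illustration of what a consumer of such a feed could do, here is a short Python sketch; the feed URL and element names are assumptions for illustration, not the actual schema.

```python
import urllib.request
import xml.etree.ElementTree as ET

FEED_URL = "https://example.com/reactionattempts.xml"  # hypothetical feed location

tree = ET.parse(urllib.request.urlopen(FEED_URL))
for rxn in tree.getroot().iter("reaction"):  # element names are assumed
    reactant_csids = [r.get("csid") for r in rxn.iter("reactant")]
    print(reactant_csids, rxn.findtext("outcome"), rxn.findtext("notebook"))
```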
I know that Peter Murray-Rust has been very active in automatically abstracting information from chemistry theses. It would be interesting to see how that approach would work for this thesis, especially with the failed experiments. Reducing a page or two of text into only the most salient bits of information manually required a level of judgement that I imagine would be tricky to do automatically.

General Transparent Solubility Prediction using Abraham Descriptors

Making solubility estimations for most organic compounds in a wide range of solvents freely available has always been a main long term objective for the Open Notebook Science Solubility Challenge. With current expertise and technology, it should be as easy to obtain a solubility estimate as it is now to get driving directions off the web.

Obviously this won't be attained purely by exhaustive measurements, although we have been focused on strategic measurements over the past two years. In parallel, we have been constantly evaluating the various solubility models out there for suitability.
Although there are several solubility models available for non-aqueous solvents, our additional requirement for transparent model building has proved surprisingly difficult to satisfy.
From this search, the Abraham solubility model [Abraham2009] floated to the top, with an important factor being that Abraham has made available extensive compilations of descriptors for solutes and solvents. In addition, the algorithms used to convert solubility measurements to Abraham descriptors (a minimum of 5 different solvents per solute) have allowed us to generate our own Abraham descriptors automatically simply by recording new measurements into our SolSum Google Spreadsheet. These can be obtained in real time as well.
This approach permitted us to provide predictions for a limited number of solutes in a wide range of solvents, and we have included these predictions in the past two editions (2nd and 3rd) of the ONS Challenge Solubility Book.
Coming at the problem from a different approach, Andrew Lang has also been trying to predict solubility using only open molecular descriptors, mainly relying on the CDK. Since our most commonly used solvent has been methanol, Andy recently generated a web service to predict solubility in that solvent.
By combining these two approaches, Andy has now created a modeling system that can not only generally predict solubility in a wide range (70+) of solvents - but it can also provide related data that can be used for modeling other phenomena such as intestinal absorption of a drug or crossing the blood-brain barrier.[Stovall 2007]
The idea is to use a Random Forest approach using freely available descriptors to predict the Abraham descriptors for any solute. A separate service then generates predicted solubilities for a wide range of solvents based on these Abraham descriptors. I'm using the term "freely available" because - although the CDK descriptors and VCCLab services are open - the model requires 2 descriptors only available from ChemSpider (ultimately from ACD/Labs).
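The second stage rests on Abraham's linear free-energy relationship, in which solvent-specific coefficients are combined with the solute descriptors. Here is a minimal sketch; the benzoic acid descriptors are typical literature-style values, while the methanol coefficients are placeholders rather than the fitted model.

```python
def log_solubility(solute, solvent):
    """Abraham LFER: log S = c + e*E + s*S + a*A + b*B + v*V, with solute
    descriptors E, S, A, B, V and solvent-specific coefficients c, e, s, a, b, v."""
    return (solvent["c"] + solvent["e"] * solute["E"] + solvent["s"] * solute["S"]
            + solvent["a"] * solute["A"] + solvent["b"] * solute["B"]
            + solvent["v"] * solute["V"])

benzoic_acid = {"E": 0.73, "S": 0.90, "A": 0.59, "B": 0.40, "V": 0.932}
methanol = {"c": 0.0, "e": 0.0, "s": 0.0, "a": 0.0, "b": 0.0, "v": 0.0}  # placeholders
print(10 ** log_solubility(benzoic_acid, methanol))  # predicted solubility in M
```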
Here is an example with benzoic acid. As long as the common name resolves to a single entry on ChemSpider, it is enough to enter it and it automatically populates the rest of the fields, which are then used by the service to generate the Abraham descriptors.
Hitting the prediction link above will automatically populate the second service and generate predicted solubilities for over 70 solvents.
This approach of allowing people to access these components separately can be useful. It can be instructive to manually play with the Abraham descriptors directly to see how predicted solubilities are affected. There are also situations where one has experimentally determined Abraham descriptors for a solute and bypassing the descriptor prediction step is desirable.
However, for those who prefer to cut to the chase, a convenient web service is available where the common name (or SMILES) of the solute is entered and the list of available solvents appears as a drop down menu.
Now here is where I think the real payoff comes for accelerating science with openness. Andy has also created a web service that returns the predicted solubility in molar as a number from common names (or SMILES) for solute and solvent via the URL. For example click this for benzoic acid in methanol. The advantage here is that solubility prediction can be easily integrated as a web service call from intuitive interfaces such as a Google Spreadsheet to enable even non-programmers to make use of the data. Notice that the web service provided in the fourth column for the average of measured solubility values enables an easy way to explore the accuracy of specific predictions.
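For programmers, the same call takes only a few lines; the URL pattern below is a placeholder standing in for the actual service endpoint.

```python
import urllib.parse
import urllib.request

def predicted_solubility(solute, solvent):
    """Fetch the predicted solubility (in M) for a solute/solvent pair by name or SMILES."""
    qs = urllib.parse.urlencode({"solute": solute, "solvent": solvent})
    url = "https://example.com/predictsolubility?" + qs  # hypothetical endpoint
    with urllib.request.urlopen(url) as resp:
        return float(resp.read().decode())

print(predicted_solubility("benzoic acid", "methanol"))
```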
Such web services could also be integrated with data from ChemSpider or custom systems. If those who use these services feed back their processed data to the open web, it could take us a step closer to automated reaction design. For example consider the custom application to select solvents for the Ugi reaction. Model builders could also use the web services for predicted and measured solubility directly.
A while back we explored using Taverna for MyExperiment to create virtual libraries of SMILES. Unfortunately we ran into issues with getting the applications developed on Macs to run on our PCs. This might be worth revisiting as a means of filtering virtual libraries through different thresholds of predicted solubility.
Andy has described his model in detail in a fully transparent way - the model itself, how it was generated and the entire dataset can be found here. We would welcome improvements of the model as well as completely new models based on our dataset using only freely available tools.
It should be noted that when I use the term "general" it refers to the model's ability to generate a number for most compounds listed in ChemSpider. Obviously compounds that most closely resemble the training set are more likely to generate better estimates. Because of our synthetic objectives using the Ugi reaction we have mainly focused on collecting solubility data for carboxylic acids, aldehydes and amides either from new measurements or from the literature.
Another important point concerns the main intended application of the model: organic synthesis. Generally the range of interest for such applications is about 0.01 - 3M. This might be very different for other applications - such as the aqueous solubility of a drug, where distinctions between much lower solubilities may be important.
For a typical organic synthesis, a solubility of 0.001M or 0.005M will probably translate as effectively insoluble. This might be a desired property for a product intended to be isolated by filtration. On the other end of the scale knowing that a solubility is either 4M or 6M will not usually have an impact on reaction design. It is enough to know that a reactant will have good solubility in a particular solvent.
Given the above considerations for intended applications and the likelihood that the current model is far from optimized, the predictions should be used cautiously. We suggest that the model is best used as a "flagging device". For example, if a reaction is to be carried out at 0.5M, one may place a threshold at 0.4M for the predicted values of reactants during solvent selection, with the recognition that a predicted 0.4M may be an actual 0.55M. A similar threshold approach can be used for the product, where in this case the lowest solubility is desired. A practical example of this is the shortlisting of solvent candidates for the Ugi reaction.
Another example of flagging involves identifying the outliers in the model. These can be inspected for experimental errors and possibly remeasured. Alternatively outliers may shed light on the limitations of the model. For example we have found that the solubility of solutes with melting points near room temperature can be greatly underestimated by the current model. This may be an opportunity to develop other models which incorporate melting point or enthalpy of fusion.[Rohani 2008]
Although it is possible that better models and more data will improve the accuracy of the predictions, this can be true only if the training set is accurate enough. Based on conversations I've had with researchers who deal with solubility, reading modeling papers and our own experience with the ONS Challenge I am starting to suspect that much of the available data just isn't accurate enough for high precision modeling. Models using data from the literature are especially vulnerable I think. Take a look at this unsettling comparison between new measurements and literature values (not to mention the model) for common compounds.[Loftsson 2006] Here is a subset:
I have also made the point in detail for the aqueous solubility of EGCG. Could this be the reason that so many different solubility models using different physical chemistry principles have evolved and continue to co-exist?
The situation reminds me a lot of the discussions taking place in the molecular docking community.[Bissantz 2010] The differences in calculated binding energies are often small in comparison with the uncertainties involved. But docking can still be used as one tool among others to find drug candidates by flagging a collection of compounds above a certain threshold binding energy.

Berkeley Open Science Summit 2010 Notes

I just returned from the Open Science Summit held at Berkeley July 29-31, 2010.

There certainly was an impressive list of presenters as well as attendees. Many of the talks were quite good, although several on the last day were more about closed collaborations than Open Science. During these presentations the assumption that patents are required to exploit discoveries in health care was repeated. This was in sharp contrast to the second day's session on gene patents, where IP protection was shown to stifle innovation and the exploitation of discoveries.
A refreshing exception to this pattern on the last day was Andrew Hessel's presentation on the Pink Army Cooperative. Andrew's strategy to cure cancer is based on the idea of customizing drugs for each individual affected by the disease. Since each drug is only applicable to one individual, the approach of expensive clinical trials doesn't apply. Since he is not interested in generating a profit from selling the drugs, IP protection also doesn't apply, and this allows him to make every part of the drug design process, including genetic analysis, publicly available. It wasn't clear if such an approach would be legal in the US but he did mention going to another country if necessary. Although he doesn't currently have cancer, he did indicate that he might have need of this technology one day by pulling out a pack of cigarettes in the middle of his talk.
Unfortunately my panel on Open Data was canceled at the last minute due to time management problems (see FF discussion on how it happened). However, I did have a chance to generally catch up with old friends (Carmen Drahl, Joanna Scott, Cameron Neylon, Jack Park).
I also discussed some promising collaborations with several people:
1) CoLab. I spoke at length with DJ Sprouse and Casey Stark about their system for scientific collaboration. We will try to represent one solubility experiment from the ONS Challenge notebook and one organic synthesis experiment from the UsefulChem notebook to see how the information can be represented within CoLab. There may be some opportunities to visualize raw data in new ways - perhaps using non-Java tools to interact with JCAMP-DX spectra.
2) IPzero Principles. I continued a conversation with Lisa Green, started with John Wilbanks and Thinh Nguyen at Creative Commons, about coming up with a series of simple recommendations for ensuring that an Open Notebook can effectively prevent the patenting of inventions within an area of interest to the Open Science community.
3) Open Chemistry Reactions. I had the chance to discuss our Reaction Attempts database with Peter Murray-Rust over breakfast on Saturday. He also showed me how he is using Oscar to extract chemical reaction information from various documents. Peter suggested that we pool together our data for a demonstration in September at the London Science Online Conference. Reaction Attempts will cover the reactions done in the UsefulChem and the Todd group's Open Notebooks. Peter will extract information from both patents and Acta Crystallographica.
4) ChemTaverna. I was pleased to learn from Carole Goble that Taverna is extending its coverage to cheminformatics applications with the ChemTaverna project. I had just mentioned that we would be interested in revisiting Taverna for creating virtual libraries of organic compounds and filtering them based on predicted solubilities in various solvents. This would allow us to contribute cheminformatics workflows to MyExperiment. Carole put me in touch with the project leader Peter Li at the University of Manchester.

Secrecy in Astronomy and the Open Science Ratchet

Probably because of the visibility of the GalaxyZoo project, I think several of my colleagues and I have been under the impression that astronomy is a somewhat more open field than chemistry or molecular biology. It was easy to rationalize such a position because patents are not an issue, as they clearly are in fields which rely more on invention than discovery. However, after reading "The Case for Pluto" by Alan Boyle, I am left with a much different impression.

This book does an excellent job of covering the recent debate over Pluto's designation as a true planet. A key trigger for this debate has been the discovery of dwarf planets with sizes very close to that of Pluto. However, these discoveries did not occur without controversy.

The story of the controversy regarding the discovery of Haumea is a particularly good example (starts on p. 108 of the book - a good summary also on Wikipedia). Starting in December 2004 Michael Brown at Caltech discovered a series of new dwarf planets. Instead of immediately reporting his team's discoveries, he worked in secrecy until July 20, 2005 when he posted an online abstract indicating the discoveries would be announced at a conference that September. However, on July 27, 2005 a Spanish team led by José Luis Ortiz Moreno filed a claim with the Minor Planet Center for priority in discovering one of these dwarf planets. This forced Brown's hand in disclosing his team's other discoveries within days - much sooner than he had anticipated.

Apparently this stirred up a great controversy in the community and officially no name was associated with the discovery, although the Spanish team's telescope at Sierra Nevada Observatory was recognized as the location of the discovery. However, Brown was allowed to select the name Haumea for the dwarf planet.

Even though the Minor Planet Center accepted Moreno's submission, most reports seem to side with Brown. The main argument amounts to an accusation of nothing less than academic fraud on Moreno's part, because he accessed public telescope logs and found some of Brown's data. It was as simple as Googling the identifier that Brown inserted in his public abstract.
If Moreno had hacked into a private computer from Brown's team I can understand fraud. But is it fraud to access public databases? We chemists do that all the time - reading abstracts from upcoming conferences to try to glean what our competitors are up to. That hasn't stopped anyone from submitting a paper or patent.
Secrecy only works if everyone competing follows the same rules. If there is a rule that planet discoveries must be made at conferences or by formal publication then this could not have happened. Moreno's submission to the Minor Planet Center should have been rejected if such a rule existed. If there is a rule that telescope logs should not be accessed then why make them public and indexed on Google?
Now there may exist field-specific conventions. I don't know what they are in the case of discoveries such as these but here is an interesting quote from Michael Brown's Wikipedia page:

When asked about this online activity, Ortiz responded with an email to Brown that suggested Brown was at fault for "hiding objects," and said that "the only reason why we are now exchanging e-mail is because you did not report your object."[3] Brown says that this statement by Ortiz contradicts the accepted scientific practice of analyzing one's research until one is satisfied that it is accurate, then submitting it to peer review prior to any public announcement. However, the MPC only needs precise enough orbit determination on the object in order to provide discovery credit, and Ortiz et al. not only provided the orbit, but "precovery" images of the body in 1957 plates.

It seems to me that there is a clash over what the conventions in the field actually are. Certainly the Minor Planet Center did not recognize the convention of peer review before public disclosure. They only required sufficient proof for the discovery.

One way to look at this story is that Moreno acted more openly than Brown by disclosing information before peer review. This action forced Brown to disclose scientific results much more quickly than he had anticipated.

In a sense this is a type of Open Science Ratchet. The actions of scientists that are most open set the pace for everyone else working on that particular project, regardless of their views on how secretive science should be.

Imagine how the scenario would have played out if one of the groups had used an Open Notebook. On December 28, 2004 everyone with a stake in the search for planets would have had the opportunity to know that a very significant find had been made. There were still details to work out - and the Brown group might not be the first to do all the calculations to completely characterize the discovery. Certainly it would affect what other researchers did - even if they were completely opposed to the concept of Open Science.

Essentially secrecy in this context is an all-or-nothing gamble. Everyone is free to not disclose their work until after peer reviewed publication. In some cases the discoverer will get full credit for the discovery and the complete analysis. But in other cases another group working in parallel will publish first and leave nothing to claim.
As scientists become more open, it is likely that their ability to claim sole priority for all aspects of a discovery will be reduced. However, they will retain priority for the observations and calculations that they made first.
The more open the science, the faster it happens. And because of the Open Science Ratchet, a few Open Scientists scattered across various fields could have a larger hand than expected in speeding up science.

Chemistry World’s round-up of money and molecules

Big chemistry news this week was the announcement of the first synthetic cell, which could provide a basis for designing organisms from scratch.

Understandably the news has caused some controversy in the media, with sceptics concerned for the future of humanity and even research rivals worried that if the technology is patentable, other research groups will lose out on a piece of the pie.

The research could have enormous commercial value in the future for applications in biofuels and chemical synthesis through chemical biology and should be viewed as another step towards a greater understanding of science.

PHARMACEUTICALS

Cheap cancer drugs say Asda

So from creating synthetic cells to destroying cancerous ones…
In a world first, supermarket Asda has announced that it will permanently sell privately prescribed cancer treatment drugs on a 'not for profit' basis in the UK, which could save patients thousands of pounds.

With a post code lottery on cancer funding dictating how much money is allocated to the treatment of each cancer patient, and the variation in cancer drugs available on the NHS depending on where you live, sufferers also have to deal with pharmacy mark-ups that can cripple patients’ finances.

Cancer affects nearly 300,000 people every year in the UK and the cost of treatment is too much for many sufferers. According to Asda, some privately prescribed cancer drugs are being sold with a 76 per cent mark-up in some high street stores.

This move will see prices of drugs like Iressa (gefitinib) – licensed to treat lung cancer – fall in Asda stores to £2167.71 compared with other high street stores such as Superdrug that sell it for £3253.56.

Asda is urging patients to shop around when buying privately prescribed cancer drugs, claiming that 63 per cent of people were unaware that prices vary between pharmacies.

Asda has called for industry to follow its lead and end the high price mark-ups on cancer drugs and is working with suppliers to negotiate further discounts on trade prices of privately prescribed cancer drugs that it can then pass onto the customer.

Aspen to acquire Sigma Pharmaceuticals

African drug giant Aspen Pharmacare Holdings Ltd. has offered to buy leading Australian pharma firm Sigma Pharmaceuticals for A$1.49 billion (£850 million) in order to expand into Australia. The offer works out at A$0.60 per share and net debt of A$785 million.

The proposal is subject to conditions such as regulatory approval, and unanimous recommendation by the Sigma board – the company has confirmed that the approach has been made and is currently considering the offer.

Genzyme pays up

US firm Genzyme, the largest maker of genetic disease medicines, has agreed to pay the US federal government $175 million (£121.5 million) in unlawful profits from the sale of products made at its Allston, Massachusetts, plant.

During an inspection in 2009, manufacturing quality at the Allston plant was found to be inadequate resulting in production delays, critical shortages of medically necessary products to consumers and drugs contaminated with metal, fibre, rubber and glass particles. These findings violated US Food and Drug Administration (FDA) regulations. Genzyme also suspended manufacturing of some of its products due to viral contamination in one of its bioreactors.

Genzyme has agreed to make improvements to its manufacturing processes at Allston, starting with an independent inspection of the plant that will recommend changes and result in an improvement plan subject to FDA approval. If the approved plan is not met, Genzyme will have to pay a substantial fine. In addition, Genzyme will have to move its vial filling operations to another plant or risk paying further disgorgement fines in the future.

INDUSTRY

Shin-Etsu’s new leadership

Japan’s most profitable chemical company, Shin-Etsu Chemical, has announced a change of leadership. Chihiro Kanagawa, former president, will become chairman, a position that has been vacant for over 15 years, and the former vice president, Shunzo Mori, will become president.

Kanagawa joined the firm in 1962, becoming president in 1990 and steering the company through some bold moves that have resulted in the company's expansion over the years. Shunzo Mori is 74 and has been at the company since 1963. He trained as a mechanical engineer and has worked his way up the company.

Shin-Etsu has extended profits and developed new areas of business, expanding its semiconductor silicon business by building on the strength of products such as silicone resins, synthetic quartz, rare earth magnets, cellulose derivatives and photoresists. PVC output has also increased and record earnings have been reported year on year.

The plans for the future include increasing sales in developing markets such as China and investing in improvements to accommodate environmental challenges.

Borouge and Linde Group get cracking

The Linde Group – a world leading gases and engineering company – and Borouge – a leading provider of innovative plastic solutions – have signed a $1.1 billion contract confirming that Linde will build a 1.5 million tonnes per year (t/y) ethane cracker at Borouge's production site in Ruwais, Abu Dhabi, in the UAE.

This deal comes hot on the heels of the inauguration of the world's largest ethane cracker, operated by Ras Laffan Olefins Co., that took place earlier this month.

The new cracker is the third of its kind to be built by Linde for Borouge in the last decade and will complement the existing crackers at the plant. Once the construction is complete, the Borouge site will be the largest ethane cracking complex in the world.

It signals a milestone in the growth of the company and is hoped to have a great impact on the automotive and advanced packaging markets in the Middle East and Asia.

SMEs pay less for chemicals

Small and medium enterprises (SMEs) in the chemicals sector are set to pay less in administrative charges following a decision by the European Commission. Small firms will pay less in fees to the European Chemicals Agency (ECHA) in connection with Classification, Labelling and Packaging Regulations (CLP) due to a reduction in levies.

The fees apply when a company asks for an alternative name for a substance or requests harmonised classification or labelling for substances.

Microenterprises will have a 90 per cent reduction in fees, small businesses will see a 60 per cent reduction, and medium size businesses will see a 30 per cent reduction. In addition, all companies that comply with CLP regulations will be able to work in their own language, as the ECHA has now translated its guidance documents.

In addition to reduced fees SMEs will also be able to gain assistance with Registration, evaluation, authorisation and restriction of chemicals (Reach) regulations and CLP regulations.

And finally….

It seems that if you are a Chartered Chemical Engineer in the UK and Ireland, you can sit back and smile smugly. Results from the IChemE 2010 UK and Ireland Salary Survey reveal that the median salary for a Chartered Chemical Engineer is now £60,400 per year compared to £57,500 in 2008 even in this economic climate. Indeed a Chartered Chemical Engineer aged 30-39 will typically earn £8500 a year more than a non-chartered chemical engineer.

Is it time for a change in career we ask ourselves….

Mike Brown

Methanol Solubility Prediction Model 4 for Ugi reactions in the literature

Since non-aqueous solubility measurements have not become part of the standard characterization of organic compounds, it is not surprising that all the data we have for Ugi products originate from measurements that we made on our own compounds.

Since methanol is our most common solvent, Andrew Lang has combined the measurements that we have with values from the literature for a range of compounds, including our Ugi products, to generate a web service returning a predicted solubility based on a submitted SMILES string. The model (Model 4) was derived from a Random Forest algorithm, using molecular descriptors supplied by the CDK and VCC.

It would be nice to be able to test the model's ability to predict what will happen if a Ugi reaction is carried out in methanol. Although the actual solubility of Ugi products in the literature is typically not reported, reading the experimental sections in papers can still provide some validation of the model.

For example, consider the following Ugi products synthesized recently by Lezinska (Tetrahedron 2010)


Note that these images show the azide group drawn in a form that does not follow the octet rule. It is necessary to represent the structures as SMILES without charges because the CDK and VCC web services used by the model do not process charges correctly. Stereochemistry also cannot be used and can be removed from the SMILES simply by deleting slashes. Thus for the two molecules above the SMILES to be submitted to the prediction web service are:

O=C(NC1CCCCC1)C(Cc2ccc(C)cc2)N(c4ccccc4C(=O)c3ccccc3)C(=O)C(Cc5ccccc5)N=N#N
AND
O=C(NC1CCCCC1)C(C(=O)c2ccccc2)N(Cc3ccc(C)cc3)C(=O)C(C)CCN=N#N
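Producing such flattened SMILES can also be scripted. Here is a minimal RDKit sketch on a made-up stereo-annotated example; note that the azide charge adjustment described above would remain a manual step, since RDKit enforces standard valences.

```python
from rdkit import Chem

smi = "N[C@@H](Cc1ccccc1)C(=O)N/C=C/C"  # hypothetical stereo-annotated input
mol = Chem.MolFromSmiles(smi)
print(Chem.MolToSmiles(mol, isomericSmiles=False))  # drops the @ and / \ marks
```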

The predicted methanol solubilities are respectively 0.004 M and 0.03 M.

Now if we look at the details in the experimental section, both of these Ugi products were synthesized in methanol at a limiting reactant concentration of about 0.1 M. Even though this is much more dilute than the usual 0.5-2.0 M generally recommended for Ugi reactions (Domling 2000), the products still precipitate and can be filtered off. This is consistent with the predicted solubilities above and the model would have suggested ahead of time that methanol might be a good solvent for isolation of the products by precipitation.

So far these are just anecdotal results but it does illustrate that solubility models can be evaluated without explicit determination of solubility in the literature.

ASMS: Forget Vioxx, eat chocolate?

After sitting through a number of incredibly technical presentations today at ASMS I came across a fantastic poster presented by Shunyan Mo of the University of Illinois College of Pharmacy, US. Using an ultrafiltration LC-MS (liquid chromatography – mass spectrometry) assay, Mo and co-workers have shown that certain flavonoids found in cocoa selectively inhibit the cyclooxygenase-2 (Cox-2) enzyme and therefore could have anti-inflammatory effects.

As discussed in this Chemistry World article, Cox inhibitors such as naproxen play a vital role in the treatment of pain and inflammation, but they do have some side effects. To reduce these, a number of pharmaceutical companies developed selective Cox-2 inhibitors, but unfortunately many of these were linked to an increased risk of blood clotting, heart attack and stroke. In 2004, those risks caused a huge embarrassment for Merck & Co. when it was forced to withdraw its blockbuster Cox-2 inhibitor Vioxx (rofecoxib), costing the company in the region of $4.75 billion (£3.3 billion) in legal settlements on top of billions of dollars in lost sales.

But now, Mo has shown that eating chocolate might help reduce inflammation. Using MS-MS experiments, the team identified two oxidation products of linoleic acid, the abundant cocoa fatty acid, that strongly and selectively inhibit Cox-2: 9-hydroxy-10,12-octadecadienoic acid (9-HODE) and 13-hydroxy-9,11-octadecadienoic acid (13-HODE).

Perhaps unsurprisingly, the research was funded by the US confectionery company Hershey, along with the US National Institutes of Health (NIH).

So next time you need to reach for an anti-inflammatory, it might be worth reaching for a bar of chocolate instead – just don’t blame me if you put on a few pounds!

Matt Wilkinson

Smoking could be good for you – if you get the message

Fancy a smoke? No, it’s my last one and I need to get an urgent message to HQ…

Sadly, this line is yet to appear in a spy film, but thanks to George Whitesides and his group at Harvard University, US, it might one day. The group has had another stab at ‘infochemistry’ – using chemical means to convey a message or information without the need for an electrical power supply.

Avid readers of this blog will remember that in June of last year the group first mooted the idea of using ‘infofuses’ soaked in alkali metal solutions to transmit coloured light messages as they burned, and then followed up with a microfluidic device in which a series of droplets pass by windows to let light through – using intensity, colour and polarisation to encode more information than standard on-off digital signals.

This time, the team have developed their ‘infofuse’ idea further. One of the major drawbacks of the original system was that the fuses tended to go out if they were in contact with a surface, and they also burned really fast – keeping a message like an SOS call repeating for 24 hours would need 2.5 km of fuse.
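As a quick sanity check on that figure (my arithmetic, not the paper’s), 2.5 km burned continuously over 24 hours corresponds to a burn rate of roughly 3 cm/s:

    # 2.5 km of fuse consumed over 24 hours of continuous burning
    fuse_length_cm = 2.5 * 1000 * 100   # 250,000 cm
    duration_s = 24 * 60 * 60           # 86,400 s
    print(fuse_length_cm / duration_s)  # ~2.9 cm/s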

The answers sound simple and almost obvious – use a slower burning fuse and keep most of it lifted off the surface. But it’s never quite as easy as all that. Keeping the fuses off the surface was the easy part: crimping them into a tent-like shape held enough of the nitrocellulose far enough away from whatever surface the fuse was resting on to stop that surface from sinking all the heat and putting out the flame.

But the timing problem required a more considered approach. Simply using a slow burning fuse was no good – it would take hours to transmit the message, and most slow burning materials don’t burn hot enough to stimulate thermal emission from the alkali metal ions. What was needed was a combination: a slow burning ‘master’ fuse with a series of fast ‘slave’ fuses sticking out of it. As the master fuse smoulders up to each slave fuse, the slave ignites and rapidly transmits its message.

This gives a compact system that can repeat a single, fast message over a long period, or transmit several different messages one after the other. The slow fuse is made from cotton soaked in sodium nitrate – similar to the ‘slow match’ used to ignite gunpowder charges in early matchlock firearms. However, the team showed that one could equally use a cigarette as the slow match – much less conspicuous if you’re an undercover agent…

Phillip Broadwith

Reference: C Kim, S W Thomas III and G M Whitesides, Angew. Chem. Int. Ed., 2010, DOI: 10.1002/anie.201001582

The Scientist Article on Electronic Lab Notebooks

Amber Dance has written an article in The Scientist (2010-05-01), Digital Upgrade: How to choose your lab’s next electronic lab notebook. It is a quick overview of the different Electronic Lab Notebooks (ELNs) currently available and should be helpful for people researching that space.

There was some coverage of Open Notebook Science, and Steve Koch and I were quoted. Ironically, my contribution appeared in the "Cons" section 🙂

Pros

  • The format is unconstrained—you can set up any categories, and as many users and pages, as you want—and fast to set up.
  • Open notebooking attracts collaborators. Koch counts three collaborations that wouldn’t have happened if he weren’t on OpenWetWare. And his students build professional networks well before they author a paper.

Cons

  • Wikis were not designed with scientific data in mind. For example, it’s hard to make a table, Koch says.
  • Open notebook science “does limit where you can send your work,” says Jean-Claude Bradley, a chemist at Drexel University in Philadelphia, who also uses an open wiki notebook. His lab sticks to journals that accept preprints.
  • Posting online voids international patent rights, although US patents are still possible.

In my opinion, one of the biggest "Pros" wasn't listed in that section: the fact that it is free. (That was mentioned elsewhere in the article, though.) When you see the cost of some of the commercial systems, that has to be a factor for many people trying to make a decision.

If privacy is an issue, wikis can certainly be made private, although I'm not sure whether that is possible on OpenWetWare. It can be done for $5/month on Wikispaces, the wiki we use for our lab notebooks - although then it wouldn't be Open Notebook Science.

Concerning Steve's point about wikis being difficult to use for storing data: that is true. However, combining a wiki with Google Spreadsheets has completely resolved that issue for us. Because we can automatically export the notebook (as HTML) and the spreadsheets (as XLS) into an integrated archive, the two platforms operate essentially as a single system.
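To give a flavour of what that integrated export might look like, here is a minimal sketch in Python. The export URL patterns are placeholders of my own - Wikispaces and Google Spreadsheets each expose export endpoints, but the exact URLs differ from what is shown here:

    import os
    import urllib.request

    # Placeholder URL patterns - not the real Wikispaces/Google endpoints
    WIKI_PAGE_URL = "http://example.wikispaces.com/{page}"
    SHEET_XLS_URL = "http://example.org/spreadsheets/{key}/export?format=xls"

    def archive_notebook(pages, sheet_keys, out_dir="notebook_archive"):
        """Save each wiki page as HTML and each spreadsheet as XLS in one folder."""
        os.makedirs(out_dir, exist_ok=True)
        for page in pages:
            urllib.request.urlretrieve(WIKI_PAGE_URL.format(page=page),
                                       os.path.join(out_dir, page + ".html"))
        for key in sheet_keys:
            urllib.request.urlretrieve(SHEET_XLS_URL.format(key=key),
                                       os.path.join(out_dir, key + ".xls"))

Run periodically, a script like this leaves a self-contained snapshot of both the prose and the data, which is what makes the two platforms feel like one system.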

OpenSciNY Open Notebook Science Talk

On May 14, 2010 I presented on Open Notebook Science at the OpenSciNY conference at the New York University Bobst Library. I introduced the topic by telling a few stories about how new forms of communication are affecting how we think about concepts like "scientific precedent", "peer review", "scientific publishing" and "scientific scholarship". At the end I spoke about archiving Open Notebook Science projects and showed the physical copies of both the Reaction Attempts and ONS Solubility Challenge books.

Margaret Smith did a wonderful job of organizing the conference with a very interesting line-up of speakers: Heather Joseph, Antony Williams, Elizabeth Brown and David Hogg. We formed break-out sessions at the end to discuss with the attendees concepts around Open Science. I was part of the session on Promoting Open Science.

The tone at this and other similar conferences I have attended recently is probably best described as cautiously optimistic and focused on what is possible. The Open Science movement - at least as far as it is reflected by the people I know - does not seem to be driven by zealots or idealists trying to get everyone to drink the Kool-Aid. It is just a bunch of people who see opportunities to do things in better ways as new tools become available - and who can't find a credible reason not to.

Check here on FriendFeed for updates about links to recordings, slides, etc.

My presentation below:

Setac Europe 2010: ‘It’ll all come out in the wash’

Anyone else recognise this saying? My parents used it a lot while I was growing up when I’d taken a course of action that, while not ideal, wasn’t going to cause any lasting damage.

In the case of silver nanoparticles in textiles, however, it seems it probably will come out in the wash – disappear down our domestic waste pipes and into our environment, with no guarantee that lasting damage won’t be done.

It has been predicted that 12-49 per cent of the silver nanoparticles produced globally end up in textiles, as antimicrobials in socks, for example. And in a first step towards figuring out whether this practice poses an environmental risk, Bernd Nowack and his team at EMPA in Switzerland have assessed whether or not these particles remain embedded in the textiles when they are washed in a washing machine.

Their key finding was that different textiles behave very differently: some release 20 per cent of their silver particles in the first wash after purchase, whereas others release hardly anything. The conclusion the team has drawn is that how the manufacturers embed the particles is very important. ‘Companies have possibilities to design safe nanotextiles that release only small amounts of silver,’ said Nowack.

Other, more predictable, findings include that fewer particles are released the second time an item of clothing is washed, and that the mechanical stress of the washing machine aids their release.

As well as trying to get textile companies to change their ways, the team also plan to consider both the environmental fate and toxicology of the released particles.

Until they do, maybe I should think about more than my nose next time I buy sweet-smelling socks.

To learn more: the work was published in the journal Environmental Science and Technology in September last year, and was well covered by the press at the time (see here, here and here).

Nina Notman