ANYmal robot excels in parkour feats thanks to neural network training – Interesting Engineering

The agility of ANYmal, a dog-like robot, has been enhanced with a new training framework that enables it to complete a basic parkour course at up to 6.6 feet (2 meters) per second. Parkour, the urban sport of navigating obstacles with athleticism, is gaining widespread popularity.

The modified learning approach, which emphasizes crawling, jumping, climbing, and crouching, may enable the robot to crawl under and vault over physical obstacles while conducting search and rescue operations.

Read the original:

ANYmal robot excels in parkour feats thanks to neural network training - Interesting Engineering

Application of artificial neural network and dynamic adsorption … – Nature.com


See the original post:

Application of artificial neural network and dynamic adsorption ... - Nature.com

Scientific discovery in the age of artificial intelligence – Nature.com


Visit link:

Scientific discovery in the age of artificial intelligence - Nature.com

What is AI Pruning? Definition from Techopedia.com – Techopedia

What Does AI Pruning Mean?

AI pruning, also known as neural network pruning, is a collection of strategies for editing a neural network to make it as lean as possible. The editing process involves removing unnecessary parameters, artificial neurons, weights, or deep learning network layers.

The goal is to improve network efficiency without significantly impacting the machine learning model's accuracy.

A deep neural network can contain millions or even billions of parameters whose values are tuned during the training phase. Many of them contribute little, or nothing at all, once the trained model has been deployed.

Done right, pruning can shrink a model's size, speed up inference, and cut memory and energy use without a significant loss of accuracy.

To improve efficiency without significant loss of accuracy, pruning is often used in combination with two other optimization techniques: quantization, which stores weights and activations at reduced numerical precision, and knowledge distillation, which trains a smaller "student" model to reproduce the behavior of a larger one.
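
As an illustration of the reduced-precision idea, here is a minimal, framework-free sketch of symmetric linear quantization, mapping floating-point weights onto 8-bit integers. The weight values and the simple per-tensor scale are illustrative assumptions, not any particular library's implementation.

```python
def quantize_int8(weights):
    """Map float weights onto the signed 8-bit range [-127, 127]."""
    # Per-tensor scale: the largest magnitude maps to 127 (assumed scheme);
    # falls back to 1.0 if all weights are zero.
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]

# Toy weights (illustrative values only).
weights = [0.82, -0.41, 0.05, -1.27, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each weight is now stored as one byte plus a shared scale factor, which is where the memory and bandwidth savings come from; the price is a small rounding error bounded by half the quantization step.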

Pruning can be particularly valuable for deploying large artificial intelligence (AI) and machine learning (ML) models on resource-constrained devices like smartphones or Internet of Things (IoT) devices at the edge of the network.

Pruning addresses these constraints by reducing a model's memory footprint, inference latency, and energy consumption.

Pruning has become an important strategy for ensuring ML models and algorithms are both efficient and effective at the edge of the network, closer to where data is generated and where quick decisions are needed.

The problem is that pruning is a balancing act. While the ultimate goal is to reduce the size of a neural network model, pruning must not cause a significant loss in performance. A model that is pruned too heavily can require extensive retraining, while a model that is pruned too lightly remains more expensive to maintain and operate.

One of the biggest challenges is determining when to prune. Iterative pruning takes place multiple times during the training process. After each pruning iteration, the network is fine-tuned to recover any lost accuracy, and the process is repeated until the desired level of sparsity (reduction in parameters) is achieved. In contrast, one-shot pruning is done all at once, typically after the network has been fully trained.
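
The two schedules can be sketched with magnitude pruning on a toy weight list. The "model" here is just a flat list of weights and the fine-tuning step is left as a comment, so this shows the control flow only, not real training.

```python
def prune_smallest(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Toy trained weights (illustrative values only).
weights = [0.9, -0.1, 0.4, -0.8, 0.05, 0.3, -0.6, 0.2]

# One-shot: jump straight to the target sparsity after training.
one_shot = prune_smallest(weights, 0.5)

def iterative_prune(weights, target, steps):
    """Reach `target` sparsity gradually, pruning a little each round."""
    for t in range(1, steps + 1):
        weights = prune_smallest(weights, target * t / steps)
        # ...fine-tune the network here to recover lost accuracy...
    return weights

iterative = iterative_prune(weights, target=0.5, steps=2)
```

On a real network the iterative loop interleaves pruning with retraining, which is exactly why it costs more compute but tends to preserve accuracy better than the one-shot pass.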

Which approach is better depends on the specific network architecture, the target deployment environment, and the model's use cases.

If model accuracy is of utmost importance, and there are sufficient computational resources and time for training, iterative pruning is likely to be more effective. On the other hand, one-shot pruning is quicker and can often reduce the model size and inference time to an acceptable level without the need for multiple iterations.

In practice, using a combination of both techniques and a more advanced pruning strategy like magnitude-based structured pruning can help achieve the best balance between model efficiency and optimal outputs.

Magnitude-based pruning is one of the most common advanced AI pruning strategies. It involves removing less important or redundant connections (weights) between neurons in a neural network.
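
The idea can be sketched in a few lines. This sketch removes whole neurons rather than individual weights, i.e. the magnitude-based structured pruning mentioned above: each row of a layer's weight matrix is scored by its L1 norm, and the weakest rows are dropped. The toy matrix and the `keep` parameter are illustrative assumptions.

```python
def prune_neurons(matrix, keep):
    """Keep the `keep` rows (neurons) with the largest L1 norm."""
    norms = [(sum(abs(w) for w in row), i) for i, row in enumerate(matrix)]
    kept = sorted(i for _, i in sorted(norms, reverse=True)[:keep])
    return [matrix[i] for i in kept]

layer = [
    [0.9, -0.8, 0.7],   # strong neuron, L1 norm 2.4
    [0.01, 0.02, 0.0],  # near-dead neuron, L1 norm 0.03
    [0.5, 0.4, -0.6],   # L1 norm 1.5
]
pruned = prune_neurons(layer, keep=2)  # drops the near-dead neuron
```

Because whole rows disappear, the pruned layer is genuinely smaller and faster on ordinary hardware, whereas zeroing scattered individual weights only pays off with sparse-aware kernels.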

View original post here:

What is AI Pruning? Definition from Techopedia.com - Techopedia

Neural Network Software Market 2023 Growth Factors and Industry … – University City Review

The Global Neural Network Software Market research report provides all the information related to the industry. It presents the market's outlook, supplying verified data that helps clients make essential decisions. It gives an overview of the market, including its definition, applications, developments, and manufacturing technology. The report tracks recent developments and innovations in the market, describes the obstacles to establishing a business in this space, and offers guidance on overcoming upcoming challenges.

Get A Free PDF Sample Copy of Report:

https://www.marketinsightsreports.com/reports/031012040629/global-neural-network-software-market-growth-status-and-outlook-2023-2029/inquiry?Mode=Naruto

Significant Players Covered in the Neural Network Software Market Report:

GMDH, Artificial Intelligence Techniques, Oracle, IBM, Microsoft, Intel, AWS, NVIDIA, TFLearn, Keras

Market Segmentation: By Type

Analysis Software

Optimization Software

Visual Software

Market Segmentation: By Application

Small and Medium Enterprises (SMEs)

Large enterprises

Regional Analysis for Neural Network Software Market:

North American Market (USA, Canada, Mexico), European Market (Germany, France, UK, Russia, Italy), Asia-Pacific Market (China, Japan, South Korea, India, Southeast Asia), South American Market (Brazil, Argentina, Colombia, etc.), Middle East and Africa Market (Saudi Arabia, UAE, Egypt, Nigeria, South Africa)

Table of Contents:

Chapter 1 Neural Network Software Market Overview

Chapter 2 Global Economic Impact on Industry

Chapter 3 Global Market Competition by Manufacturers

Chapter 4 Global Production, Revenue (Value) by Region.

Chapter 5 Global Supply (Production), Consumption, Export, Import by Regions

Chapter 6 Global Production, Revenue (Value), Price Trend by Type

Chapter 7 Global Market Analysis by Application

Chapter 8 Manufacturing Cost Analysis

Chapter 9 Industrial Chain, Sourcing Strategy and Downstream Buyers

Chapter 10 Marketing Strategy Analysis, Distributors/Traders

Chapter 11 Market Effect Factors Analysis

Chapter 12 Global Neural Network Software Market Forecast

Read the full analysis report for a better understanding (description, TOC, list of tables and figures, and much more):

https://www.marketinsightsreports.com/reports/031012040629/global-neural-network-software-market-growth-status-and-outlook-2023-2029?Mode=Naruto

The research provides answers to the following key questions:

-What is the projected market size of the Neural Network Software market by 2029?

-What will be the average market share in the coming years?

-What are the major growth drivers and restraints of the worldwide Neural Network Software market across different geographies?

-Who are the key vendors expected to lead the market during the assessment period 2023 to 2029?

-What are the trending and emerging technologies expected to influence the development of the worldwide market?

-What growth strategies have the major market vendors adopted to stay ahead of the competition?

Global Neural Network Software Market Report: Key Features

A comprehensive global and regional analysis of the Neural Network Software market is also cited in this report. It provides detailed coverage of all industry segments to evaluate potential trends, development strategies, and industry size estimates through 2029. The report includes an in-depth assessment of companies that operate in the global Neural Network Software market. Each participant's company profile includes a portfolio examination, sales revenue, a SWOT analysis, and recent developments. Growth projections examine the product segments and regions where industry-leading contributors should focus, along with investment trends, production/consumption ratios, and more.

Key Benefits for Industry Participants and Stakeholders

Custom services available with the report:

20% free customization.

You can add 5 countries according to your choice.

You can add 5 companies according to your choice.

Free customization up to 40 hours.

1 year post-delivery support from the date of delivery.

Contact Us:

Irfan Tamboli (Head of Sales) Market Insights Reports

Phone: + 1704 266 3234 | +91-750-707-8687

sales@marketinsightsreports.com | irfan@marketinsightsreports.com

View post:

Neural Network Software Market 2023 Growth Factors and Industry ... - University City Review

Artificial Intelligence Accuracy and Bias Can be Improved through … – Fagen wasanni

A team of researchers from Bucknell University's Freeman College of Management have discovered that compressing or pruning machine learning models known as neural networks can lead to improved accuracy and reduced bias in artificial intelligence (AI) applications.

Led by Professor Thiago Serra, the research team found that moderate compression of neural networks makes them more accurate and less biased. Neural networks are a type of AI method that mimics the human brain's data processing capabilities.

Supported by a $174,000 grant from the National Science Foundation, the researchers developed exact neural network compression algorithms to reduce their size and improve their accessibility on regular devices. Their work has been accepted for presentation at two notable AI conferences and is expected to influence future AI development.

By removing smaller and less important connections within the neural network, the research team observed improved performance and reduced bias. The compression technique helps the neural network make correct predictions more uniformly for different groups. The team's findings were presented at NeurIPS, a leading conference on neural networks, in December.
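The article does not describe the team's exact compression algorithm, but the core idea of removing the smallest, least important connections is commonly implemented as magnitude pruning. A minimal sketch (the function name, layer shape, and 50% pruning fraction are illustrative assumptions, not details from the Bucknell study):

```python
import numpy as np

def magnitude_prune(weights, fraction):
    """Zero out the given fraction of weights with the smallest magnitudes."""
    w = weights.copy()
    k = int(fraction * w.size)
    if k == 0:
        return w
    # Threshold at the k-th smallest absolute value across the whole layer
    threshold = np.sort(np.abs(w), axis=None)[k - 1]
    w[np.abs(w) <= threshold] = 0.0
    return w

rng = np.random.default_rng(0)
layer = rng.normal(size=(64, 64))          # stand-in for one weight matrix
pruned = magnitude_prune(layer, fraction=0.5)
sparsity = float(np.mean(pruned == 0.0))
print(f"sparsity after pruning: {sparsity:.2f}")
```

In practice, pruning is usually followed by a short fine-tuning pass so the remaining connections can compensate for the ones removed.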

In a separate paper, the researchers focused on determining the most efficient connections to remove from the neural network to enhance its efficiency. They extended their prior work to develop mathematical results that guide pruning decisions. This paper was presented at an international conference in France in June.

The research is an outcome of Professor Serra's work dating back to 2018 and his lab's two years of testing neural networks. The findings have practical implications for leveraging the capabilities of neural networks and improving their future performance.

Overall, the research suggests that neural network compression can play a crucial role in enhancing the accuracy and minimizing bias in AI applications.

Read this article:

Artificial Intelligence Accuracy and Bias Can be Improved through ... - Fagen wasanni

Reinforcement learning allows underwater robots to locate and track … – Science Daily

A team led by the Institut de Ciències del Mar (ICM-CSIC) in Barcelona, in collaboration with the Monterey Bay Aquarium Research Institute (MBARI) in California, the Universitat Politècnica de Catalunya (UPC) and the Universitat de Girona (UdG), proves for the first time that reinforcement learning (i.e., a neural network that learns the best action to perform at each moment based on a series of rewards) allows autonomous vehicles and underwater robots to locate and carefully track marine objects and animals. The details are reported in a paper published in the journal Science Robotics.

Currently, underwater robotics is emerging as a key tool for improving knowledge of the oceans in the face of the many difficulties in exploring them, with vehicles capable of descending to depths of up to 4,000 meters. In addition, the in-situ data they provide help to complement other data, such as that obtained from satellites. This technology makes it possible to study small-scale phenomena, such as CO2 capture by marine organisms, which helps to regulate climate change.

Specifically, this new work reveals that reinforcement learning, widely used in the field of control and robotics, as well as in the development of tools related to natural language processing such as ChatGPT, allows underwater robots to learn what actions to perform at any given time to achieve a specific goal. These action policies match, or even improve in certain circumstances, traditional methods based on analytical development.
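The article does not specify the algorithm family used, but the reward-driven loop it describes can be illustrated with the simplest form of reinforcement learning, tabular Q-learning, on a made-up one-dimensional tracking task (the grid size, reward function, and hyperparameters are all invented for this sketch):

```python
import numpy as np

# Toy task: an agent on an 11-cell line is rewarded for staying close to a
# fixed target cell; Q-learning learns which move to make from each cell.
N, target = 11, 8
actions = [-1, 0, 1]                        # move left, stay, move right
Q = np.zeros((N, len(actions)))
rng = np.random.default_rng(0)
alpha, gamma, eps = 0.1, 0.9, 0.1           # step size, discount, exploration

for _ in range(2000):                       # short training episodes
    s = int(rng.integers(N))
    for _ in range(20):
        # Epsilon-greedy action selection
        a = int(rng.integers(3)) if rng.random() < eps else int(np.argmax(Q[s]))
        s2 = int(np.clip(s + actions[a], 0, N - 1))
        reward = -abs(s2 - target)          # closer to the target = higher reward
        # Standard Q-learning update
        Q[s, a] += alpha * (reward + gamma * Q[s2].max() - Q[s, a])
        s = s2

# Greedy policy: from every cell, the learned action moves toward the target
policy = [actions[int(np.argmax(Q[s]))] for s in range(N)]
print(policy)
```

The same structure (state, action, reward, update) carries over to the vehicle-tracking setting, just with far richer states and a learned network in place of the table.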

"This type of learning allows us to train a neural network to optimize a specific task, which would be very difficult to achieve otherwise. For example, we have been able to demonstrate that it is possible to optimize the trajectory of a vehicle to locate and track objects moving underwater," explains Ivan Masmitjà, the lead author of the study, who has worked between ICM-CSIC and MBARI.

This "will allow us to deepen the study of ecological phenomena such as migration or movement at small and large scales of a multitude of marine species using autonomous robots. In addition, these advances will make it possible to monitor other oceanographic instruments in real time through a network of robots, where some can be on the surface monitoring and transmitting by satellite the actions performed by other robotic platforms on the seabed," points out the ICM-CSIC researcher Joan Navarro, who also participated in the study.

To carry out this work, the researchers used acoustic ranging techniques, which estimate the position of an object from distance measurements taken at different points. The accuracy of the localization therefore depends strongly on where the acoustic range measurements are taken. This is where artificial intelligence, and specifically reinforcement learning, becomes important: it can identify the best measurement points and, therefore, the optimal trajectory for the robot to follow.
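The study's exact estimator is not given in the article, but the underlying range-based localization step is standard: distance measurements taken at known points can be linearized and solved by least squares. A minimal sketch (the measurement points and target position are made up for illustration):

```python
import numpy as np

def range_localize(points, ranges):
    """Least-squares position estimate from ranges measured at known points.

    Subtracting the first range equation from the others linearizes the
    problem:  2 (p_i - p_0) . x = (r_0^2 - r_i^2) + (|p_i|^2 - |p_0|^2).
    """
    p = np.asarray(points, dtype=float)
    r = np.asarray(ranges, dtype=float)
    A = 2.0 * (p[1:] - p[0])
    b = (r[0] ** 2 - r[1:] ** 2) + (np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2))
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x

# A target at (3, 4) observed from four measurement points
pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
tgt = np.array([3.0, 4.0])
dists = np.linalg.norm(pts - tgt, axis=1)
print(range_localize(pts, dists))           # ~ [3. 4.]
```

The estimate degrades when the measurement points are poorly placed (e.g., nearly collinear), which is exactly the geometry problem a reinforcement-learned trajectory can optimize.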

Neural networks were trained, in part, using the computer cluster at the Barcelona Supercomputing Center (BSC-CNS), where the most powerful supercomputer in Spain, one of the most powerful in Europe, is located. "This made it possible to adjust the parameters of different algorithms much faster than using conventional computers," indicates Prof. Mario Martín, from the Computer Science Department of the UPC and author of the study.

Once trained, the algorithms were tested on different autonomous vehicles, including the AUV Sparus II developed by VICOROB, in a series of experimental missions carried out in the port of Sant Feliu de Guíxols, in the Baix Empordà, and in Monterey Bay (California), in collaboration with the principal investigator of the Bioinspiration Lab at MBARI, Kakani Katija.

"Our simulation environment incorporates the control architecture of real vehicles, which allowed us to implement the algorithms efficiently before going to sea," explains Narcís Palomeras, from the UdG.

For future research, the team will study the possibility of applying the same algorithms to more complicated missions, for example, using multiple vehicles to locate objects, detect fronts and thermoclines, or track algae upwelling cooperatively through multi-platform reinforcement learning techniques.

This research has been carried out thanks to the European Marie Curie Individual Fellowship won by the researcher Ivan Masmitjà in 2020, and the BITER project, funded by the Ministry of Science and Innovation of the Government of Spain, which is currently under implementation.

Read this article:

Reinforcement learning allows underwater robots to locate and track ... - Science Daily

AI helps scientists to eavesdrop on endangered pink dolphins – Nature.com

Botos use clicks and whistles to communicate with each other and to find prey. Credit: Sylvain Cordier/Gamma-Rapho via Getty

Researchers have used artificial intelligence (AI) to map the movements of two endangered species of dolphin in the Amazon River by training a neural network to recognize the animals' unique clicks and whistles.

The findings, published in Scientific Reports on 27 July [1], could lead to better conservation strategies by helping researchers to build an accurate picture of the dolphins' movements across a vast area of rainforest that becomes submerged each year after the rainy season.

Using sound is much less invasive than conventional tracking techniques, such as the use of GPS tags, boats or aerial drones.


"Sound is probably the only sense that we know of that we all share on Earth," says co-author Michel André, a bioacoustician at the Technical University of Catalonia in Barcelona, Spain.

André and his colleagues wanted to explore the activity of two species, the boto (Inia geoffrensis), also known as the pink river dolphin, and the tucuxi (Sotalia fluviatilis), across the floodplains of the Mamirauá reserve in northern Brazil. The researchers placed underwater microphones at several sites to eavesdrop on the animals' whereabouts.

To distinguish the dolphin sounds from the noisy soundscape of the Amazon, they turned to AI, feeding the recordings into a deep-learning neural network capable of categorizing sounds in real time, "exactly as we do with our own brain," says André.

"Using this technology, researchers can analyse volumes of information that would otherwise be almost impossible," says Federico Mosquera-Guerra, who studies Amazonian dolphins at the National University of Colombia in Bogotá.

The AI was trained to identify three types of sound: dolphin, rainfall and boat engines. Both dolphin species use echolocation clicks almost constantly to sense their environment, and they communicate with others by whistling. Detecting these clicks and whistles enabled the researchers to map the animals' movements. Botos and tucuxis have distinct whistles, so the neural network could distinguish between the species.
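The paper's actual network architecture is not described here, but the three-way classification task can be illustrated with a toy softmax classifier on synthetic 8-bin "spectral" features; the class structure, features, and training settings below are all invented stand-ins, not the study's model:

```python
import numpy as np

rng = np.random.default_rng(1)
CLASSES = ["dolphin", "rainfall", "boat engine"]

def make_batch(n):
    """Synthetic features: each class concentrates energy in its own band."""
    y = rng.integers(3, size=n)
    X = rng.normal(0.0, 0.3, size=(n, 8))
    for i, c in enumerate(y):
        X[i, c * 2:c * 2 + 3] += 2.0        # class-specific frequency boost
    return X, y

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)    # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Train a linear softmax classifier with plain gradient descent
W = np.zeros((8, 3))
X, y = make_batch(1000)
onehot = np.eye(3)[y]
for _ in range(300):
    p = softmax(X @ W)
    W -= 0.1 * X.T @ (p - onehot) / len(X)

Xt, yt = make_batch(200)
acc = float(np.mean(softmax(Xt @ W).argmax(axis=1) == yt))
print(f"held-out accuracy: {acc:.2f}")
```

A real detector would replace the synthetic vectors with spectrogram frames of clicks, whistles, rain, and engine noise, and the linear model with a deep network, but the training loop has the same shape.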

The study is part of a collaboration between the Technical University of Catalonia and the Mamirauá Institute of Sustainable Development in Tefé, Brazil, which aims to use this technology for monitoring the Amazon's biodiversity and threats to it.

AI empowers conservation biology

Both dolphin species are endangered: estimates suggest that the boto population is declining by 50% every ten years, and the tucuxi population every nine years [2]. Monitoring when and where the animals move will allow researchers to help protect their populations and come up with measures to help Indigenous communities to live alongside the dolphins, says André. Dolphins can disrupt fisheries across the floodplains, for example, by competing for fish or becoming tangled in nets.

Mosquera-Guerra says that collecting such information is fundamental to inform decisions on conservation across the Amazon region.

In future, the team wants to train the neural network to detect other aquatic species, and to deploy the system over a wider area. The same approach could also be used in the ocean. André's previous work using this system has shown the effects of human-made noise pollution on sperm whales, and has enabled the development of a warning system for ships to help avoid the animals [3].

View original post here:

AI helps scientists to eavesdrop on endangered pink dolphins - Nature.com

Living a Varied Life Boosts Brain Connectivity in Mice – ScienceAlert

The brains of mice benefit from an active and varied lifestyle by forming enhanced neural connections.

Researchers in Germany compared the brain activity of mice raised in different environments and found those raised in an 'enriched' environment had more activity in their hippocampus, suggesting the presence of a more robust and connected neural network.

Because of its central role in learning and memory, the human hippocampus is frequently affected by degenerative brain diseases like Alzheimer's.

"The results by far exceeded our expectations," says neuroscientist and biomedical engineer Hayder Amin from the German Center for Neurodegenerative Diseases (DZNE). "Simplified, one can say that the neurons of mice from the enriched environment were much more interconnected than those raised in standard housing."

The findings, based on Amin and colleagues' "brain-on-chip" technology and computational analysis tools, could help researchers understand and prevent brain dysfunctions, and could lead to new brain-inspired artificial intelligence methods.

"We have uncovered a wealth of data that illustrates the benefits of a brain shaped by rich experience," says Gerd Kempermann, an adult neurogenesis researcher at DZNE.

The scientists compared brain tissue from two groups of 12-week-old mice whose experiences began from six weeks of age. One group lived in standard cages that had no special features or fun activities to partake in, just food, water, and nesting materials.

The other group had the time of their lives in larger cages with toys, tunnels, plastic tubes fashioned into mazes, extra nesting material, and little houses, which sure does sound like a great weekend, even for a human.

The researchers examined brain tissue using a complementary metal-oxide-semiconductor (CMOS)-based neurochip with 4,096 electrodes to record the firing of thousands of neurons at once.

They were able to measure connectivity between the entire hippocampus and the brain's outer layer, which governs a whole heap of cognitive processes, and grouped the recordings into six interconnected hippocampal-cortical regions.

"No matter which parameter we looked at, a richer experience literally boosted connections in the neuronal networks," says Amin. "These findings suggest that leading an active and varied life shapes the brain on whole new grounds."

It's been known for some time that our experiences leave a mark on our brain's connectivity, but this demonstrates just how significant those marks can be.

"All we knew in this area so far has either been taken from studies with single electrodes or imaging techniques like magnetic resonance imaging," Kempermann explains. "Here, we can literally see the circuitry at work down to the scale of single cells."

Amin, Kempermann, and the rest of the team hope their tools could be expanded to look at how social interactions, physical activity, and learning processes, all of which have a big impact on how the brain works, affect the brain's function.

Of course, the results were seen in mice brains, not humans, but studying the entire hippocampus gives them a larger-scale view of functional connectivity.

The scientists think that mapping and understanding how experiences change the connectome could help find the mechanisms that cause brain dysfunctions and identify new targets for more effective treatments in the future.

Their platform could lay the groundwork for prosthetic devices that mimic brain functions to restore and improve memory capabilities lost due to aging or disease.

"This paves the way to understand the role of plasticity and reserve formation in combating neurodegenerative diseases, especially with respect to novel preventive strategies," Kempermann says.

"Also, this will help provide insights into disease processes associated with neurodegeneration, such as dysfunctions of brain networks."

The study has been published in Biosensors and Bioelectronics.

See original here:

Living a Varied Life Boosts Brain Connectivity in Mice - ScienceAlert

International Conference on Machine Learning Draws Machine … – Fagen wasanni

Last week, the International Conference on Machine Learning took place on Oahu, attracting thousands of machine learning researchers, founders, and venture capitalists. Attendees included notable figures such as John Schulman, co-founder of OpenAI and head of their reinforcement learning team, Noam Brown, an OpenAI researcher recognized for creating the first AI to achieve human-level performance in the strategy game Diplomacy, and Paige Bailey, a lead product manager at Google DeepMind.

The conference featured a variety of talks and paper presentations, allowing participants to delve into the latest advancements and research in the field of machine learning. Additionally, attendees had the opportunity to relax and enjoy the sandy beaches of Oahu, as well as indulge in ChatGPT-designed cocktails, including the popular Neural Network Negroni at an event hosted by OpenAI.

One of the highlights of the conference was the OpenAI booth, where participants flocked to learn about the company's latest paper on consistency models. These models are part of a new family of models that aim to enhance the efficiency of image generation compared with diffusion models, which power image generators like OpenAI's DALL-E 2. This innovative approach garnered significant interest and discussion among conference-goers.

The International Conference on Machine Learning serves as a critical platform for knowledge sharing, networking, and collaboration within the machine learning community. By bringing together industry leaders, researchers, and investors, the conference fosters an environment conducive to advancing the field of machine learning and exploring its vast potential for various applications.

Read this article:

International Conference on Machine Learning Draws Machine ... - Fagen wasanni

Ghost particles paint a new picture of the Milky Way – Science News Explores

Antarctica: A continent mostly covered in ice, which sits in the southernmost part of the world.

artificial intelligence: A type of knowledge-based decision-making exhibited by machines or computers. The term also refers to the field of study in which scientists try to create machines or computer software capable of intelligent behavior.

astronomer: A scientist who works in the field of research that deals with celestial objects, space and the physical universe.

black hole: A region of space having a gravitational field so intense that no matter or radiation (including light) can escape.

blazar: A bright and distant active galaxy that shoots powerful jets of radiation from its center and directly toward Earth.

colleague: Someone who works with another; a co-worker or team member.

core: Something usually round-shaped in the center of an object. (in geology) Earth's innermost layer. Or, a long, tube-like sample drilled down into ice, soil or rock. Cores allow scientists to examine layers of sediment, dissolved chemicals, rock and fossils to see how the environment at one location changed through hundreds to thousands of years or more.

cosmic rays: Very high-energy particles, mostly protons, that bombard Earth from all directions. These particles originate outside our solar system. They are equivalent to the nucleus of an atom. They travel through space at high rates of speed (often close to the speed of light).

cosmos: (adj. cosmic) A term that refers to the universe and everything within it.

electric charge: The physical property responsible for electric force; it can be negative or positive.

galaxy: A group of stars and usually invisible, mysterious dark matter all held together by gravity. Giant galaxies, such as the Milky Way, often have more than 100 billion stars. The dimmest galaxies may have just a few thousand. Some galaxies also have gas and dust from which they make new stars.

gamma rays: High-energy radiation often generated by processes in and around exploding stars. Gamma rays are the most energetic form of light.

mass: A number that shows how much an object resists speeding up and slowing down, basically a measure of how much matter that object is made from.

Milky Way: The galaxy in which Earth's solar system resides.

neural network: Also known as a neural net. A computer program designed to manage lots of data in complex ways. These systems consist of many (perhaps millions) of simple, densely linked connections within a computer. Each connection, or node, can perform a simple operation. One node might be connected to several feeder nodes, which send it data. Several more nodes in another layer sit ready to accept the newly processed data and act upon them in some other way. The general idea of networks was initially patterned loosely on the way nerve cells work in the brain to process signals that lead to thought and learning.

neutrino: A subatomic particle with a mass close to zero. Neutrinos rarely react with normal matter. Three kinds of neutrinos are known.

particle: A minute amount of something.

physicist: A scientist who studies the nature and properties of matter and energy.

sensor: A device that picks up information on physical or chemical conditions such as temperature, barometric pressure, salinity, humidity, pH, light intensity or radiation and stores or broadcasts that information. Scientists and engineers often rely on sensors to inform them of conditions that may change over time or that exist far from where a researcher can measure them directly. (in biology) The structure that an organism uses to sense attributes of its environment, such as heat, winds, chemicals, moisture, trauma or an attack by predators.

star: The basic building block from which galaxies are made. Stars develop when gravity compacts clouds of gas. When they become hot enough, stars will emit light and sometimes other forms of electromagnetic radiation. The sun is our closest star.

subatomic: Anything smaller than an atom, which is the smallest bit of matter that has all the properties of whatever chemical element it is (like hydrogen, iron or calcium).

supernova: (plural: supernovae or supernovas) A star that suddenly increases greatly in brightness because of a catastrophic explosion that ejects most (or sometimes all) of its mass.

system: A network of parts that together work to achieve some function. For instance, the blood, vessels and heart are primary components of the human body's circulatory system. Similarly, trains, platforms, tracks, roadway signals and overpasses are among the potential components of a nation's railway system. System can even be applied to the processes or ideas that are part of some method or ordered set of procedures for getting a task done.

telescope: Usually a light-collecting instrument that makes distant objects appear nearer through the use of lenses or a combination of curved mirrors and lenses. Some, however, collect radio emissions (energy from a different portion of the electromagnetic spectrum) through a network of antennas.

universe: The entire cosmos: All things that exist throughout space and time. It has been expanding since its formation during an event known as the Big Bang, some 13.8 billion years ago (give or take a few hundred million years).

X-ray: A type of radiation analogous to gamma rays, but having somewhat lower energy.

Read the original:

Ghost particles paint a new picture of the Milky Way - Science News Explores

On the evaluation of the carbon dioxide solubility in polymers using … – Nature.com


View post:

On the evaluation of the carbon dioxide solubility in polymers using ... - Nature.com

Hawai’i Education Association awards scholarships to three Big … – Big Island Now

From left: Lacey Alvarez, David Brooks and Chayanee Brooks. Photo Courtesy: HEA

The Hawaii Educational Association, a nonprofit organization founded more than 100 years ago to support the value of the education profession, recently awarded a total of $21,500 in scholarships and grants to 17 individuals at various stages of their teaching careers, from high school graduates just entering college to experienced educators pursuing additional education to broaden their capabilities.

"It's exciting to see so many outstanding individuals who are so passionate about the teaching profession," said Joan Lewis, Hawaii Educational Association president.

The Hawaii Educational Association awarded five scholarships to recent high school graduates, three to continuing college students, and nine scholarships to educators. Three Hawaii Island educators received scholarships.

Lacey Alvarez, who has worked with children from preschool to the fifth grade over the past eight years, received a $2,000 student-teacher scholarship sponsored by the Helen MacKay Memorial. The resident of Kealakekua, on the Kona Coast of Hawaii Island, is currently pursuing a bachelor's degree in early elementary/special education at the University of Hawaii at Mānoa, and the scholarship will help with tuition.

She was the lead teacher at Creative Day Preschool for four years, and has been serving as an educational assistant at Holualoa Elementary School for the past four years, working with children who may have learning disabilities or may need more support while in the classroom.

Two educators from Kaʻū High & Pāhala Elementary School received Ronald K. Toma scholarships for professional development for in-service public school educators, with awards ranging from $390 to $1,000.

Chayanee Brooks, a Volcano resident who has taught advanced placement English for high school students at Kaʻū High & Pāhala Elementary School since 2013, is pursuing a doctoral degree in neuroscience at the Institute of Molecular Biosciences, Mahidol University, a research institution in Bangkok, Thailand, to better understand diverse neural network connectivity and better serve her students.

David Brooks, also a Volcano resident, has been an advanced placement social studies teacher at Kaʻū High & Pāhala Elementary School since 2013 and is pursuing a master's degree in Asian studies online at Mahidol University in Bangkok, Thailand. He is developing lesson plans for courses as part of his coursework. With the support of the Hawaii Educational Association grant, he will do research and write a master's thesis aligned with his teaching.

"Asian studies, human geography, economics, world history, and global studies classes all will use social, political, and economic content in their curriculums, and my research focus will be intentionally chosen to use in the classroom and to share with other social studies teachers in Hawaii and across the U.S.," Brooks explained.

See the rest here:

Hawai'i Education Association awards scholarships to three Big ... - Big Island Now

ChatGPT & Advanced Prompt Engineering: Driving the AI Evolution – Unite.AI

OpenAI has been instrumental in developing revolutionary tools like the OpenAI Gym, designed for training reinforcement algorithms, and GPT-n models. The spotlight is also on DALL-E, an AI model that crafts images from textual inputs. One such model that has garnered considerable attention is OpenAI's ChatGPT, a shining exemplar in the realm of Large Language Models.

ChatGPT has transformed the chatbot landscape, offering human-like responses to user inputs and expanding its applications across domains from software development and testing to business communication, and even the creation of poetry.

In the hands of businesses and individuals, GPT-4, for instance, could serve as an inexhaustible knowledge reservoir, proficient in subjects ranging from Mathematics and Biology to Legal Studies. Such sophisticated and accessible AI models are poised to redefine the future of work, learning, and creativity.

Generative models like GPT-4 can produce new data based on existing inputs. This distinguishing attribute lets them undertake diverse tasks, including generating text, images, music, and video.

In the context of ChatGPT and OpenAI models, a prompt is the input that users provide to the models, usually in the form of text. Prompts serve as the primary means of controlling the output generated by these models, making them crucial for successful and relevant conversations.

Using well-defined prompts is the key to engaging in useful and meaningful conversations with AI systems. Poorly-defined prompts can result in output that is not helpful to the user and may even lead to misleading content.

To demonstrate the impact of prompt quality, let's compare two different prompts given to ChatGPT. The first prompt, poorly-defined and assuming prior knowledge, generates a potentially unhelpful response for the user. On the other hand, the second prompt provides more context and examples, ensuring a more relevant and understandable response.

Prompt 1: Tell me about Convolutional Neural Networks.

Response 1: Convolutional Neural Networks (CNNs) are multi-layer perceptron networks that consist of fully connected layers and pooling layers. They are commonly used in image recognition tasks. The process involves applying filters to input images and downscaling them using pooling layers.

Prompt 2: I want to learn about Convolutional Neural Networks, but I have no prior knowledge of Artificial Neural Networks. Can you please provide a beginner-friendly explanation and an example of how CNNs are used in image recognition?

Response 2: Convolutional Neural Networks (CNNs) are a type of neural network primarily used for image recognition tasks. Unlike other neural networks, they leverage convolutional layers and pooling layers to process images. Imagine you have an image of a cat. CNNs can identify different features in the image, such as whiskers, ears, and tail, by applying filters and downsampling the image using pooling layers. This process makes CNNs highly effective for recognizing objects in images.

By comparing the two responses, it is evident that a well-defined prompt leads to a more relevant and user-friendly response. Prompt design and engineering are growing disciplines that aim to optimize the output quality of AI models like ChatGPT.

In the following sections of this article, we will delve into the realm of advanced methodologies aimed at refining Large Language Models (LLMs), such as prompt engineering techniques and tactics. These include few-shot learning, ReAct, chain-of-thought, RAG, and more.

Before we proceed, it's important to understand a key issue with LLMs, referred to as 'hallucination'. In the context of LLMs, 'hallucination' signifies the tendency of these models to generate outputs that might seem reasonable but are not rooted in factual reality or the given input context.

This problem was starkly highlighted in a recent court case where a defense attorney used ChatGPT for legal research. The AI tool, faltering due to its hallucination problem, cited non-existent legal cases. This misstep had significant repercussions, causing confusion and undermining credibility during the proceedings. This incident serves as a stark reminder of the urgent need to address the issue of 'hallucination' in AI systems.

Our exploration into prompt engineering techniques aims to improve these aspects of LLMs. By enhancing their efficiency and safety, we pave the way for innovative applications such as information extraction. Furthermore, it opens doors to seamlessly integrating LLMs with external tools and data sources, broadening the range of their potential uses.

Generative Pre-trained Transformer 3 (GPT-3) marked an important turning point in the development of generative AI models, as it introduced the concept of 'few-shot learning'. This method was a game-changer due to its capability of operating effectively without the need for comprehensive fine-tuning. The GPT-3 framework is discussed in the paper "Language Models are Few-Shot Learners", where the authors demonstrate how the model excels across diverse use cases without necessitating custom datasets or code.

Unlike fine-tuning, which demands continuous effort to solve varying use cases, few-shot models demonstrate easier adaptability to a broader array of applications. While fine-tuning might provide robust solutions in some cases, it can be expensive at scale, making the use of few-shot models a more practical approach, especially when integrated with prompt engineering.

Imagine you're trying to translate English to French. In few-shot learning, you would provide GPT-3 with a few translation examples like sea otter -> loutre de mer. GPT-3, being the advanced model it is, is then able to continue providing accurate translations. In zero-shot learning, you wouldn't provide any examples, and GPT-3 would still be able to translate English to French effectively.

The term 'few-shot learning' comes from the idea that the model is given a limited number of examples to 'learn' from. It's important to note that 'learning' in this context doesn't involve updating the model's parameters or weights; rather, the examples condition the model's output at inference time.

Few Shot Learning as Demonstrated in GPT-3 Paper

Zero-shot learning takes this concept a step further. In zero-shot learning, no examples of task completion are provided to the model. The model is expected to perform well based on its initial training alone, making this methodology ideal for open-domain question-answering scenarios such as ChatGPT.

In many instances, a model proficient in zero-shot learning can perform well when provided with few-shot or even single-shot examples. This ability to switch between zero, single, and few-shot learning scenarios underlines the adaptability of large models, enhancing their potential applications across different domains.

Zero-shot learning methods are becoming increasingly prevalent. These methods are characterized by their capability to recognize objects unseen during training. Here is a practical example of a Few-Shot Prompt:

"Translate the following English phrases to French:

'sea otter' translates to 'loutre de mer'
'sky' translates to 'ciel'
What does 'cloud' translate to in French?"

By providing the model with a few examples and then posing a question, we can effectively guide the model to generate the desired output. In this instance, GPT-3 would likely correctly translate 'cloud' to 'nuage' in French.
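As a concrete sketch, the few-shot prompt above can be assembled programmatically before being sent to any completion API. The helper name and exact phrasing here are illustrative, not part of any official SDK:

```python
def build_few_shot_prompt(examples, query):
    """Assemble a few-shot translation prompt: demonstration pairs
    followed by the query the model should complete."""
    lines = ["Translate the following English phrases to French:"]
    for english, french in examples:
        lines.append(f"'{english}' translates to '{french}'")
    lines.append(f"What does '{query}' translate to in French?")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    [("sea otter", "loutre de mer"), ("sky", "ciel")],
    "cloud",
)
print(prompt)
```

The resulting string is what would be sent as the user message; the model sees the two demonstrations and continues the pattern.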

We will delve deeper into the various nuances of prompt engineering and its essential role in optimizing model performance during inference. We'll also look at how it can be effectively used to create cost-effective and scalable solutions across a broad array of use cases.

As we further explore the complexity of prompt engineering techniques in GPT models, it's important to highlight our last post Essential Guide to Prompt Engineering in ChatGPT. This guide provides insights into the strategies for instructing AI models effectively across a myriad of use cases.

In our previous discussions, we delved into fundamental prompt methods for large language models (LLMs) such as zero-shot and few-shot learning, as well as instruction prompting. Mastering these techniques is crucial for navigating the more complex challenges of prompt engineering that we'll explore here.

Few-shot learning can be limited due to the restricted context window of most LLMs. Moreover, without the appropriate safeguards, LLMs can be misled into delivering potentially harmful output. Plus, many models struggle with reasoning tasks or following multi-step instructions.

Given these constraints, the challenge lies in leveraging LLMs to tackle complex tasks. An obvious solution might be to develop more advanced LLMs or refine existing ones, but that could entail substantial effort. So, the question arises: how can we optimize current models for improved problem-solving?

Equally fascinating is the exploration of how this technique interfaces with creative applications in Unite AI's Mastering AI Art: A Concise Guide to Midjourney and Prompt Engineering which describes how the fusion of art and AI can result in awe-inspiring art.

Chain-of-thought prompting leverages the inherent auto-regressive properties of large language models (LLMs), which excel at predicting the next word in a given sequence. By prompting a model to elucidate its thought process, it induces a more thorough, methodical generation of ideas, which tends to align closely with accurate information. This alignment stems from the model's inclination to process and deliver information in a thoughtful and ordered manner, akin to a human expert walking a listener through a complex concept. A simple statement like "walk me through step by step how to..." is often enough to trigger this more verbose, detailed output.

While conventional CoT prompting requires including worked demonstrations in the prompt, an emerging area is zero-shot CoT prompting. This approach, introduced by Kojima et al. (2022), innovatively adds the phrase "Let's think step by step" to the original prompt.
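A minimal sketch of zero-shot CoT, assuming nothing beyond string formatting: the trigger phrase from Kojima et al. is simply appended after the question (the function name is illustrative).

```python
COT_TRIGGER = "Let's think step by step."

def zero_shot_cot(question):
    """Wrap a question so the model is nudged into producing
    step-by-step reasoning before its final answer."""
    return f"Q: {question}\nA: {COT_TRIGGER}"

print(zero_shot_cot(
    "A juggler has 16 balls and half of them are golf balls. "
    "How many golf balls are there?"
))
```

The model then completes the text after the trigger, typically producing intermediate reasoning before the answer.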

Let's create an advanced prompt where ChatGPT is tasked with summarizing key takeaways from AI and NLP research papers.

In this demonstration, we will use the model's ability to understand and summarize complex information from academic texts. Using the few-shot learning approach, let's teach ChatGPT to summarize key findings from AI and NLP research papers:

1. Paper Title: "Attention Is All You Need" Key Takeaway: Introduced the transformer model, emphasizing the importance of attention mechanisms over recurrent layers for sequence transduction tasks.

2. Paper Title: "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding" Key Takeaway: Introduced BERT, showcasing the efficacy of pre-training deep bidirectional models, thereby achieving state-of-the-art results on various NLP tasks.

Now, with the context of these examples, summarize the key findings from the following paper:

Paper Title: "Prompt Engineering in Large Language Models: An Examination"

This prompt not only maintains a clear chain of thought but also makes use of a few-shot learning approach to guide the model. It ties into our keywords by focusing on the AI and NLP domains, specifically tasking ChatGPT to perform a complex operation which is related to prompt engineering: summarizing research papers.

ReAct, short for Reason and Act, was introduced by Google in the paper "ReAct: Synergizing Reasoning and Acting in Language Models". It revolutionized how language models interact with a task, prompting the model to dynamically generate both verbal reasoning traces and task-specific actions.

Imagine a human chef in the kitchen: they not only perform a series of actions (cutting vegetables, boiling water, stirring ingredients) but also engage in verbal reasoning or inner speech ("now that the vegetables are chopped, I should put the pot on the stove"). This ongoing mental dialogue helps in strategizing the process, adapting to sudden changes ("I'm out of olive oil, I'll use butter instead"), and remembering the sequence of tasks. ReAct mimics this human ability, enabling the model to quickly learn new tasks and make robust decisions, just like a human would under new or uncertain circumstances.

ReAct can tackle hallucination, a common issue with chain-of-thought (CoT) systems. CoT, although an effective technique, lacks the capacity to interact with the external world, which can lead to fact hallucination and error propagation. ReAct compensates for this by interfacing with external sources of information. This interaction allows the system to not only validate its reasoning but also update its knowledge based on the latest information from the external world.

The fundamental working of ReAct can be explained through an instance from HotpotQA, a task requiring high-order reasoning. On receiving a question, the ReAct model breaks it down into manageable parts and creates a plan of action. The model generates a reasoning trace (thought) and identifies a relevant action. It may decide to look up information about the Apple Remote on an external source, like Wikipedia (action), and then update its understanding based on the obtained information (observation). Through multiple thought-action-observation steps, ReAct retrieves information to support its reasoning while refining what it needs to retrieve next.

HotpotQA is a dataset, derived from Wikipedia, composed of 113k question-answer pairs designed to train AI systems in complex reasoning, as questions necessitate reasoning over multiple documents to answer. On the other hand, CommonsenseQA 2.0, constructed through gamification, includes 14,343 yes/no questions and is designed to challenge AI's understanding of common sense, as the questions are intentionally crafted to mislead AI models.

The process could look something like this:
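A hedged sketch of such a loop is below, with a hard-coded dictionary standing in for the external lookup and a scripted plan standing in for the model's generated thoughts; in a real ReAct agent, the LLM produces each thought and action itself.

```python
def wikipedia_lookup(query):
    # Stub for an external retrieval tool; a real agent would call an API.
    knowledge = {
        "Apple Remote": "The Apple Remote was designed to control "
                        "the Front Row media program.",
    }
    return knowledge.get(query, "No result found.")

def react_episode(question, plan):
    """Emit a thought -> action -> observation trace for each planned step."""
    trace = [f"Question: {question}"]
    for thought, lookup_query in plan:
        trace.append(f"Thought: {thought}")
        trace.append(f"Action: Search[{lookup_query}]")
        trace.append(f"Observation: {wikipedia_lookup(lookup_query)}")
    return trace

trace = react_episode(
    "What program was the Apple Remote originally designed to interact with?",
    [("I should look up the Apple Remote first.", "Apple Remote")],
)
print("\n".join(trace))
```

Each observation is fed back into the model's context, so the next thought can build on what was just retrieved.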

The result is a dynamic, reasoning-based process that can evolve based on the information it interacts with, leading to more accurate and reliable responses.

Comparative visualization of four prompting methods Standard, Chain-of-Thought, Act-Only, and ReAct, in solving HotpotQA and AlfWorld (https://arxiv.org/pdf/2210.03629.pdf)

Designing ReAct agents is a specialized task, given their ability to achieve intricate objectives. For instance, a conversational agent built on the base ReAct model incorporates conversational memory to provide richer interactions. However, the complexity of this task is streamlined by tools such as LangChain, which has become the standard for designing these agents.

The paper "Context-faithful Prompting for Large Language Models" underscores that while LLMs have shown substantial success in knowledge-driven NLP tasks, their excessive reliance on parametric knowledge can lead them astray in context-sensitive tasks. For example, when a language model is trained on outdated facts, it can produce incorrect answers if it overlooks contextual clues.

This problem is apparent in instances of knowledge conflict, where the context contains facts differing from the LLM's pre-existing knowledge. Consider an instance where a Large Language Model (LLM), primed with data before the 2022 World Cup, is given a context indicating that France won the tournament. However, the LLM, relying on its pretrained knowledge, continues to assert that the previous winner, i.e., the team that won the 2018 World Cup, is still the reigning champion. This demonstrates a classic case of 'knowledge conflict'.

In essence, knowledge conflict in an LLM arises when new information provided in the context contradicts the pre-existing knowledge the model has been trained on. The model's tendency to lean on its prior training rather than the newly provided context can result in incorrect outputs. On the other hand, hallucination in LLMs is the generation of responses that may seem plausible but are not rooted in the model's training data or the provided context.

Another issue arises when the provided context doesn't contain enough information to answer a question accurately, a situation known as prediction with abstention. For instance, if an LLM is asked about the founder of Microsoft based on a context that does not provide this information, it should ideally abstain from guessing.

More Knowledge Conflict and the Power of Abstention Examples

To improve the contextual faithfulness of LLMs in these scenarios, the researchers proposed a range of prompting strategies. These strategies aim to make the LLMs' responses more attuned to the context rather than relying on their encoded knowledge.

One such strategy is to frame prompts as opinion-based questions, where the context is interpreted as a narrator's statement, and the question pertains to this narrator's opinion. This approach refocuses the LLM's attention to the presented context rather than resorting to its pre-existing knowledge.
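A minimal sketch of this reframing, assuming a hypothetical narrator name and template (not taken verbatim from the paper):

```python
def opinion_based_prompt(narrator, context, question):
    """Recast the context as a narrator's statement and ask for the
    narrator's view, steering the model toward the supplied context."""
    return (
        f'{narrator} said: "{context}"\n'
        f"In {narrator}'s opinion, {question}"
    )

print(opinion_based_prompt(
    "Bob",
    "France won the 2022 FIFA World Cup.",
    "which team is the reigning World Cup champion?",
))
```

Because the question asks about the narrator's opinion rather than world knowledge, the model is pushed to answer from the quoted statement even when it conflicts with its training data.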

Adding counterfactual demonstrations to prompts has also been identified as an effective way to increase faithfulness in cases of knowledge conflict. These demonstrations present scenarios with false facts, which guide the model to pay closer attention to the context to provide accurate responses.

Instruction fine-tuning is a supervised learning phase that capitalizes on providing the model with specific instructions, for instance, "Explain the distinction between a sunrise and a sunset." The instruction is paired with an appropriate answer, something along the lines of, "A sunrise refers to the moment the sun appears over the horizon in the morning, while a sunset marks the point when the sun disappears below the horizon in the evening." Through this method, the model essentially learns how to adhere to and execute instructions.
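The instruction-answer pair above can be stored as a single supervised training record; the JSON field names below follow the common Alpaca-style convention and are an assumption, not something mandated by the text.

```python
import json

# One instruction-following training record in the Alpaca-style layout.
record = {
    "instruction": "Explain the distinction between a sunrise and a sunset.",
    "input": "",  # no auxiliary input for this example
    "output": (
        "A sunrise refers to the moment the sun appears over the horizon in "
        "the morning, while a sunset marks the point when the sun disappears "
        "below the horizon in the evening."
    ),
}
print(json.dumps(record, indent=2))
```

A fine-tuning dataset is then simply a list of such records, one per instruction.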

This approach significantly influences the process of prompting LLMs, leading to a radical shift in the prompting style. An instruction fine-tuned LLM permits immediate execution of zero-shot tasks, providing seamless task performance. If the LLM is yet to be fine-tuned, a few-shot learning approach may be required, incorporating some examples into your prompt to guide the model toward the desired response.

"Instruction Tuning with GPT-4" discusses an attempt to use GPT-4 to generate instruction-following data for fine-tuning LLMs. The authors used a rich dataset comprising 52,000 unique instruction-following entries in both English and Chinese.

The dataset plays a pivotal role in instruction tuning LLaMA models, an open-source series of LLMs, resulting in enhanced zero-shot performance on new tasks. Noteworthy projects such as Stanford Alpaca have effectively employed Self-Instruct tuning, an efficient method of aligning LLMs with human intent, leveraging data generated by advanced instruction-tuned teacher models.

The primary aim of instruction tuning research is to boost the zero and few-shot generalization abilities of LLMs. Further data and model scaling can provide valuable insights. With the current GPT-4 data size at 52K and the base LLaMA model size at 7 billion parameters, there is enormous potential to collect more GPT-4 instruction-following data and combine it with other data sources leading to the training of larger LLaMA models for superior performance.

The potential of LLMs is particularly visible in complex reasoning tasks such as mathematics or commonsense question-answering. However, the process of inducing a language model to generate rationales (a series of step-by-step justifications, or chain-of-thought) has its own set of challenges. It often requires the construction of large rationale datasets or a sacrifice in accuracy due to the reliance on only few-shot inference.

Self-Taught Reasoner (STaR) offers an innovative solution to these challenges. It utilizes a simple loop to continuously improve a model's reasoning capability. The iterative process starts by generating rationales to answer multiple questions, seeded with a few rationale examples. If a generated answer is incorrect, the model tries again to generate a rationale, this time being given the correct answer. The model is then fine-tuned on all the rationales that resulted in correct answers, and the process repeats.
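The loop described above can be sketched as follows; `DummyModel` is a stand-in so the control flow runs, since the real method generates with and fine-tunes an actual LLM.

```python
def star_iteration(model, problems, few_shot_rationales):
    """One STaR pass: keep rationales that reach the correct answer,
    rationalize failures by revealing the answer, then fine-tune."""
    keep = []
    for question, correct in problems:
        rationale, answer = model.generate(few_shot_rationales, question)
        if answer != correct:
            # Rationalization: give the correct answer and reason backward.
            rationale, answer = model.generate(
                few_shot_rationales, question, hint=correct
            )
        if answer == correct:
            keep.append((question, rationale, correct))
    model.fine_tune(keep)
    return model

class DummyModel:
    """Toy model: only finds the answer when handed the hint."""
    def generate(self, examples, question, hint=None):
        if hint is not None:
            return f"Working backward, the answer must be {hint}.", hint
        return "I am unsure.", None
    def fine_tune(self, data):
        self.data = data

model = star_iteration(
    DummyModel(),
    [("What can be used to carry a small dog?", "basket")],
    few_shot_rationales=[],
)
print(len(model.data))
```

The failed attempt is rescued by rationalization, so it still contributes a training example for the next fine-tuning round.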

STaR methodology, demonstrating its fine-tuning loop and a sample rationale generation on CommonsenseQA dataset (https://arxiv.org/pdf/2203.14465.pdf)

To illustrate this with a practical example, consider the question "What can be used to carry a small dog?" with answer choices ranging from a swimming pool to a basket. The STaR model generates a rationale, identifying that the answer must be something capable of carrying a small dog and landing on the conclusion that a basket, designed to hold things, is the correct answer.

STaR's approach is unique in that it leverages the language model's pre-existing reasoning ability. It employs a process of self-generation and refinement of rationales, iteratively bootstrapping the model's reasoning capabilities. However, STaR's loop has its limitations. The model may fail to solve new problems in the training set because it receives no direct training signal for problems it fails to solve. To address this issue, STaR introduces rationalization. For each problem the model fails to answer correctly, it generates a new rationale by providing the model with the correct answer, which enables the model to reason backward.

STaR, therefore, stands as a scalable bootstrapping method that allows models to learn to generate their own rationales while also learning to solve increasingly difficult problems. The application of STaR has shown promising results in tasks involving arithmetic, math word problems, and commonsense reasoning. On CommonsenseQA, STaR improved over both a few-shot baseline and a baseline fine-tuned to directly predict answers, and performed comparably to a model that is 30× larger.

The concept of Tagged Context Prompts revolves around providing the AI model with an additional layer of context by tagging certain information within the input. These tags essentially act as signposts for the AI, guiding it on how to interpret the context accurately and generate a response that is both relevant and factual.

Imagine you are having a conversation with a friend about a certain topic, let's say 'chess'. You make a statement and then tag it with a reference, such as '(source: Wikipedia)'. Now, your friend, who in this case is the AI model, knows exactly where your information is coming from. This approach aims to make the AI's responses more reliable by reducing the risk of hallucinations, or the generation of false facts.
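A small sketch of how such tags might be attached mechanically; the bracketed tag format is illustrative, not the paper's exact syntax.

```python
def tag_context(statements):
    """Prefix each context statement with its source tag so the model
    can see where each claim comes from."""
    return "\n".join(f"[source: {source}] {text}" for text, source in statements)

prompt = tag_context([
    ("Chess is a two-player strategy board game.", "Wikipedia"),
]) + "\nUsing only the tagged statements above, what kind of game is chess?"
print(prompt)
```

The tags act as the signposts described above, giving the model provenance for each statement it is asked to rely on.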

A unique aspect of tagged context prompts is their potential to improve the 'contextual intelligence' of AI models. For instance, the paper demonstrates this using a diverse set of questions extracted from multiple sources, like summarized Wikipedia articles on various subjects and sections from a recently published book. The questions are tagged, providing the AI model with additional context about the source of the information.

This extra layer of context can prove incredibly beneficial when it comes to generating responses that are not only accurate but also adhere to the context provided, making the AI's output more reliable and trustworthy.

OpenAI's ChatGPT showcases the uncharted potential of Large Language Models (LLMs) in tackling complex tasks with remarkable efficiency. Advanced techniques such as few-shot learning, ReAct prompting, chain-of-thought, and STaR, allow us to harness this potential across a plethora of applications. As we dig deeper into the nuances of these methodologies, we discover how they're shaping the landscape of AI, offering richer and safer interactions between humans and machines.

Despite the challenges such as knowledge conflict, over-reliance on parametric knowledge, and potential for hallucination, these AI models, with the right prompt engineering, have proven to be transformative tools. Instruction fine-tuning, context-faithful prompting, and integration with external data sources further amplify their capability to reason, learn, and adapt.

Go here to read the rest:

ChatGPT & Advanced Prompt Engineering: Driving the AI Evolution - Unite.AI

5G Advanced and Wireless AI Set To Transform Cellular Networks … – Counterpoint Research

The recent surge in interest in generative AI highlights the critical role that AI will play in future wireless systems. With the transition to 5G, wireless systems have become increasingly complex and more challenging to manage, forcing the wireless industry to think beyond traditional rules-based design methods.

5G Advanced will expand the role of wireless AI across 5G networks, introducing new, innovative AI applications that will enhance the design and operation of networks and devices over the next three to five years. Indeed, wireless AI is set to become a key pillar of 5G Advanced and will play a critical role in the end-to-end (E2E) design and optimization of wireless systems. In the case of 6G, wireless AI will become native and all-pervasive, operating autonomously between devices and networks and across all protocols and network layers.

E2E Systems Optimization

AI has already been used in smartphones and other devices for several years and is now increasingly being used in the network. However, AI is currently implemented independently, i.e. either on the device or in the network. As a result, E2E systems performance optimization across devices and the network has not yet been fully realized. One reason for this is that on-device AI training has not been possible until recently.

On-device AI will play a key role in improving the E2E optimization of 5G networks, bringing important benefits for operators and users, as well as overcoming key challenges. Firstly, on-device AI enables processing to be distributed over millions of devices, thus harnessing the aggregated computational power of all these devices. Secondly, it enables AI model learning to be customized to a particular user's personalized data. Finally, this personalized data stays local on the device and is not shared with the cloud. This improves reliability and alleviates data sovereignty concerns. On-device AI will not be limited to just smartphones but will be implemented across all kinds of devices, from consumer devices to sensors and a plethora of industrial equipment.

New AI-native processors are being developed to implement on-device AI and other AI-based applications. A good example is Qualcomm's new Snapdragon X75 5G modem-RF chip, which has a dedicated hardware tensor accelerator. Using Qualcomm's own AI implementation, this Gen 2 AI processor boosts the X75's AI performance by more than 2.5 times compared to the previous Gen 1 design.

While on-device AI will play a key role in improving the E2E performance of 5G networks, overall systems optimization is limited when AI is implemented independently. To enable true E2E performance optimization, AI training and inference need to be done on a system-wide basis, i.e. collaboratively across both the network and the devices. Making this a reality in wireless system design requires not only AI know-how but also deep wireless domain knowledge. This so-called cross-node AI is a key focus of 5G Advanced, with a number of use cases being defined in 3GPP's Release 18 specification and further use cases expected to be added in later releases.

Wireless AI: 5G Advanced Release 18 Use Cases

3GPP's Release 18 is the starting point for the more extensive use of wireless AI expected in 6G. Three use cases have been prioritized for study in this release:

Channel State Feedback:

Channel state information (CSI) is used to determine the propagation characteristics of the communication link between a base station and a user device and describes how this propagation is affected by the local radio environment. Accurate CSI data is essential to provide reliable communications. With traditional model-based CSI, the user device compresses the downlink CSI data and feeds the compressed data back to the base station. Despite this compression, the signalling overhead can still be significant, particularly in the case of massive MIMO radios, reducing the device's uplink capacity and adversely affecting its battery life.

An alternative approach is to use AI to track the various parameters of the communications link. In contrast to model-based CSI, a data-driven air interface can dynamically learn from its environment to improve performance and efficiency. AI-based channel estimation thus overcomes many of the limitations of model-based CSI feedback techniques, resulting in higher accuracy and hence improved link performance. This is particularly effective at the edges of a cell.
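To make the contrast concrete, here is a minimal, hypothetical sketch of data-driven CSI compression: a linear encoder/decoder pair learned from channel samples, with PCA standing in for the neural autoencoder a real system would use. All sizes and names below are illustrative assumptions, not Qualcomm's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "channel" samples: 64 antennas, correlated across antennas so
# they are compressible (a stand-in for real massive-MIMO CSI data).
n_ant, n_samples, rank = 64, 500, 8
basis = rng.normal(size=(n_ant, rank))
H = rng.normal(size=(n_samples, rank)) @ basis.T  # low-rank channel snapshots

# "Train" a linear encoder/decoder pair from data (PCA is the simplest
# data-driven compressor; a deployed system would use a neural network).
_, _, Vt = np.linalg.svd(H, full_matrices=False)
encoder = Vt[:rank]           # device side: 64 floats -> 8 floats
decoder = encoder.T           # network side: 8 floats -> 64 floats

h = H[0]
feedback = encoder @ h        # compressed CSI report sent on the uplink
h_hat = decoder @ feedback    # base-station reconstruction

compression_ratio = n_ant / rank
err = np.linalg.norm(h - h_hat) / np.linalg.norm(h)
print(compression_ratio, err)  # 8x less feedback, near-perfect reconstruction
```

Because the learned basis matches the channel's structure, the report shrinks eightfold with negligible reconstruction error; that is the overhead saving the article attributes to data-driven CSI.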

Implementing ML-based CSI feedback, however, can be challenging in a system with multiple vendors. To overcome this, Qualcomm has developed a sequential training technique which avoids the need to share data across vendors. With this approach, the user device is first trained using its own data. Then, the same data is used to train the network. This eliminates the need to share proprietary neural network models across vendors. Qualcomm has successfully demonstrated sequential training on massive MIMO radios at its 3.5GHz test network in San Diego (Exhibit 1).
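The sequential idea can be sketched in a toy linear setting: the device vendor trains its encoder privately and shares only a dataset of (compressed report, CSI) pairs, from which the network vendor fits its own decoder. Everything below is an illustrative assumption, not the actual 3GPP or Qualcomm procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy low-rank CSI data held locally by the device vendor.
n_ant, k = 16, 4
H = rng.normal(size=(200, k)) @ rng.normal(size=(k, n_ant))

# Step 1 (device vendor): learn an encoder on local data. The weights
# W_enc stay proprietary and are never shared.
_, _, Vt = np.linalg.svd(H, full_matrices=False)
W_enc = Vt[:k]
Z = H @ W_enc.T                 # only these latent reports are shared

# Step 2 (network vendor): fit a decoder from the shared (Z, H) pairs
# by least squares, with no access to W_enc itself.
W_dec, *_ = np.linalg.lstsq(Z, H, rcond=None)

H_hat = Z @ W_dec
err = np.linalg.norm(H - H_hat) / np.linalg.norm(H)
print(err)  # near zero: decoder learned without exchanging model weights
```

The point is the information flow, not the math: each side trains on the same data, but neither side's model ever crosses the vendor boundary.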

Exhibit 1: Realizing system capacity gain even in challenging non-LOS communication

AI-based Millimetre Wave Beam Management:

The second use case involves the use of ML to improve beam prediction on millimetre wave radios. Rather than continuously measuring all beams, ML is used to intelligently select the most appropriate beams to be measured as and when needed. An ML algorithm is then used to predict future beams by interpolating between the beams selected, i.e. without the need to measure all the beams all the time. This is done at both the device and the base station. As with CSI feedback, this improves network throughput and reduces power consumption.
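A toy sketch of the idea, with `np.interp` standing in for the learned beam predictor and a synthetic beam-gain profile (all numbers are illustrative assumptions):

```python
import numpy as np

# Instead of sweeping all 64 beams, measure a sparse subset and predict
# the rest. Linear interpolation stands in for the ML predictor here.
n_beams = 64
angles = np.linspace(0, np.pi, n_beams)
true_gain = np.sinc(4 * (angles - 1.1)) ** 2   # smooth toy beam-gain profile

measured_idx = np.arange(0, n_beams, 4)         # sweep only every 4th beam
predicted = np.interp(angles, angles[measured_idx], true_gain[measured_idx])

best_pred = int(np.argmax(predicted))
best_true = int(np.argmax(true_gain))
overhead_saving = 1 - len(measured_idx) / n_beams
print(best_pred, best_true, overhead_saving)  # best beam found within a
# couple of indices of the true optimum, with 75% fewer measurements
```

A real predictor would learn the gain structure from data rather than interpolate, but the trade-off is the same: fewer sweeps for nearly the same beam choice.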

Qualcomm recently demonstrated the use of ML-based algorithms on its 28GHz massive MIMO test network and showed that the performance of the AI-based system was equivalent to a base case network set-up where all beams are measured.

Precise Positioning:

The third use case involves the use of ML to enable precise positioning. Qualcomm has demonstrated the use of multi-cell round-trip time (RTT) and angle-of-arrival (AoA)-based positioning in an outdoor network in San Diego. The vendor also demonstrated how ML-based positioning with RF fingerprinting can be used to overcome challenging non-line-of-sight channel conditions in indoor industrial private networks.
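As a rough illustration of RF fingerprinting, the sketch below matches a noisy signal-strength measurement against a surveyed database, with a nearest-neighbour lookup standing in for the trained ML model; the path-loss model, anchor layout, and noise level are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

# Four fixed radio anchors at the corners of a 10 m x 10 m area.
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])

def fingerprint(pos):
    # Toy path-loss model: received signal strength from each anchor.
    d = np.linalg.norm(anchors - pos, axis=1)
    return -20 * np.log10(d + 0.1)

# Offline survey: record a fingerprint at every 0.5 m grid point.
xs, ys = np.meshgrid(np.arange(0, 10.5, 0.5), np.arange(0, 10.5, 0.5))
grid = np.column_stack([xs.ravel(), ys.ravel()])
db = np.array([fingerprint(p) for p in grid])

# Online query: a noisy measurement from an unknown position is matched
# to the closest stored fingerprint (1-NN in place of the ML model).
true_pos = np.array([3.2, 7.1])
meas = fingerprint(true_pos) + rng.normal(scale=0.5, size=4)
est = grid[np.argmin(np.linalg.norm(db - meas, axis=1))]

err_m = np.linalg.norm(est - true_pos)
print(est, err_m)  # estimate lands near the true position
```

The fingerprint depends only on received power, not on a direct line of sight, which is why this style of positioning tolerates the non-line-of-sight conditions mentioned above.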

An AI-Native 6G Air Interface

6G will need to deliver a significant leap in performance and spectrum efficiency compared to 5G if it is to deliver even faster data rates and more capacity while enabling new 6G use cases. To do this, the 6G air interface will need to accommodate higher-order Giga MIMO radios capable of operating in the upper mid-band spectrum (7-16GHz), support wider bandwidths in new sub-THz 6G bands (100GHz+) as well as on existing 5G bands. In addition, 6G will need to accommodate a far broader range of devices and services plus support continuous innovation in air interface design.

To meet these requirements, the 6G air interface must be designed to be AI native from the outset, i.e. 6G will largely move away from the traditional, model-driven approach of designing communications networks and transition toward a data-driven design, in which ML is integrated across all protocols and layers with distributed learning and inference implemented across devices and networks.

This will be a truly disruptive change to the way communication systems have been designed in the past but will offer many benefits. For example, through self-learning, an AI-native air interface design will be able to support continuous performance improvements, where both sides of the air interface (the network and the device) can dynamically adapt to their surroundings and optimize operations based on local conditions.

5G Advanced wireless AI/ML will be the foundation for much more AI innovation in 6G and will result in many new network capabilities. For instance, the ability of the 6G AI-native air interface to refine existing communication protocols and learn new protocols, coupled with the ability to offer E2E network optimization, will result in wireless networks that can be dynamically customized to suit specific deployment scenarios, radio environments and use cases. This will be a boon for operators, enabling them to automatically adapt their networks to target a range of applications, including various niche and vertical-specific markets.

See the original post:

5G Advanced and Wireless AI Set To Transform Cellular Networks ... - Counterpoint Research

Applications of Traffic Flow Forecasting part3 | by Monodeep … – Medium

Photo by Joseph Chan on Unsplash

Author : Aosong Feng, Leandros Tassiulas

Abstract : Traffic flow forecasting on graphs has real-world applications in many fields, such as transportation systems and computer networks. Traffic forecasting can be highly challenging due to complex spatial-temporal correlations and non-linear traffic patterns. Existing works mostly model such spatial-temporal dependencies by considering spatial correlations and temporal correlations separately and fail to model the direct spatial-temporal correlations. Inspired by the recent success of transformers in the graph domain, in this paper, we propose to directly model the cross-spatial-temporal correlations on the spatial-temporal graph using local multi-head self-attentions. To reduce the time complexity, we set the attention receptive field to the spatially neighboring nodes, and we also introduce an adaptive graph to capture the hidden spatial-temporal dependencies. Based on these attention mechanisms, we propose a novel Adaptive Graph Spatial-Temporal Transformer Network (ASTTN), which stacks multiple spatial-temporal attention layers to apply self-attention on the input graph, followed by linear layers for predictions. Experimental results on public traffic network datasets, METR-LA, PEMS-BAY, PeMSD4, and PeMSD7, demonstrate the superior performance of our model.

2. A Correlation Information-based Spatiotemporal Network for Traffic Flow Forecasting (arXiv)

Author : Weiguo Zhu, Yongqi Sun, Xintong Yi, Yan Wang

Abstract : The technology of traffic flow forecasting plays an important role in intelligent transportation systems. Based on graph neural networks and attention mechanisms, most previous works utilize the transformer architecture to discover spatiotemporal dependencies and dynamic relationships. However, they have not considered correlation information among spatiotemporal sequences thoroughly. In this paper, based on the maximal information coefficient, we present two elaborate spatiotemporal representations, spatial correlation information (SCorr) and temporal correlation information (TCorr). Using SCorr, we propose a correlation information-based spatiotemporal network (CorrSTN) that includes a dynamic graph neural network component for integrating correlation information into spatial structure effectively and a multi-head attention component for modeling dynamic temporal dependencies accurately. Utilizing TCorr, we explore the correlation pattern among different periodic data to identify the most relevant data, and then design an efficient data selection scheme to further enhance model performance. The experimental results on the highway traffic flow (PEMS07 and PEMS08) and metro crowd flow (HZME inflow and outflow) datasets demonstrate that CorrSTN outperforms the state-of-the-art methods in terms of predictive performance. In particular, on the HZME (outflow) dataset, our model makes significant improvements compared with the ASTGNN model by 12.7%, 14.4% and 27.4% in the metrics of MAE, RMSE and MAPE, respectively.

Continue reading here:

Applications of Traffic Flow Forecasting part3 | by Monodeep ... - Medium

A New Attack Impacts ChatGPTand No One Knows How to Stop It – WIRED

"Making models more resistant to prompt injection and other adversarial jailbreaking measures is an area of active research," says Michael Sellitto, interim head of policy and societal impacts at Anthropic. "We are experimenting with ways to strengthen base model guardrails to make them more harmless, while also investigating additional layers of defense."

ChatGPT and its brethren are built atop large language models: enormous neural network algorithms geared toward using language, trained on vast amounts of human text, which predict the characters that should follow a given input string.

These algorithms are very good at making such predictions, which makes them adept at generating output that seems to tap into real intelligence and knowledge. But these language models are also prone to fabricating information, repeating social biases, and producing strange responses as answers prove more difficult to predict.

Adversarial attacks exploit the way that machine learning picks up on patterns in data to produce aberrant behaviors. Imperceptible changes to images can, for instance, cause image classifiers to misidentify an object, or make speech recognition systems respond to inaudible messages.

Developing such an attack typically involves looking at how a model responds to a given input and then tweaking it until a problematic prompt is discovered. In one well-known experiment, from 2018, researchers added stickers to stop signs to bamboozle a computer vision system similar to the ones used in many vehicle safety systems. There are ways to protect machine learning algorithms from such attacks, by giving the models additional training, but these methods do not eliminate the possibility of further attacks.
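The trial-and-error loop described above can be sketched with a toy stand-in for the model: a random linear scorer whose "badness" the attacker maximizes by greedily swapping suffix tokens. Nothing here is a real language model or the CMU attack itself; it only illustrates the search pattern.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy "model": each (position, token) pair in an appended suffix gets a
# fixed score; higher total = model more likely to misbehave. A real
# attack queries an actual model's gradients or outputs instead.
vocab_size, suffix_len = 50, 8
weights = rng.normal(size=(suffix_len, vocab_size))

def badness(suffix):
    return sum(weights[i, tok] for i, tok in enumerate(suffix))

# Greedy coordinate ascent: at each position, swap in whichever token
# raises the score most, and repeat until nothing improves.
suffix = [0] * suffix_len
for _ in range(20):
    improved = False
    for i in range(suffix_len):
        best_tok = int(np.argmax(weights[i]))
        if best_tok != suffix[i]:
            suffix[i] = best_tok
            improved = True
    if not improved:
        break

print(badness(suffix))  # the best score this toy model can be pushed to
```

Against a real model the scoring step is expensive and noisy, which is why such attacks take many queries; but the loop (score, tweak, repeat) is the same.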

Armando Solar-Lezama, a professor in MIT's College of Computing, says it makes sense that adversarial attacks exist in language models, given that they affect many other machine learning models. But he says it is extremely surprising that an attack developed on a generic open source model should work so well on several different proprietary systems.

Solar-Lezama says the issue may be that all large language models are trained on similar corpora of text data, much of it downloaded from the same websites. "I think a lot of it has to do with the fact that there's only so much data out there in the world," he says. He adds that the main method used to fine-tune models to get them to behave, which involves having human testers provide feedback, may not, in fact, adjust their behavior that much.

Solar-Lezama adds that the CMU study highlights the importance of open source models to open study of AI systems and their weaknesses. In May, a powerful language model developed by Meta was leaked, and the model has since been put to many uses by outside researchers.

The outputs produced by the CMU researchers are fairly generic and do not seem harmful. But companies are rushing to use large models and chatbots in many ways. Matt Fredrikson, another associate professor at CMU involved with the study, says that a bot capable of taking actions on the web, like booking a flight or communicating with a contact, could perhaps be goaded into doing something harmful in the future with an adversarial attack.

To some AI researchers, the attack primarily points to the importance of accepting that language models and chatbots will be misused. "Keeping AI capabilities out of the hands of bad actors is a horse that's already fled the barn," says Arvind Narayanan, a computer science professor at Princeton University.

Narayanan says he hopes that the CMU work will nudge those who work on AI safety to focus less on trying to align models themselves and more on trying to protect systems that are likely to come under attack, such as social networks, which are likely to experience a rise in AI-generated disinformation.

Solar-Lezama of MIT says the work is also a reminder to those who are giddy with the potential of ChatGPT and similar AI programs. "Any decision that is important should not be made by a [language] model on its own," he says. "In a way, it's just common sense."

See more here:

A New Attack Impacts ChatGPTand No One Knows How to Stop It - WIRED

Four Ways to Build AI Tools Without Knowing How to Code – Lifehacker

This post is part of Lifehacker's Living With AI series: We investigate the current state of AI, walk through how it can be useful (and how it can't), and evaluate where this revolutionary tech is heading next. Read more here.

There's a lot of talk about how AI is going to change your life. But unless you know how to code, and are deeply aware of the latest advancements in AI tech, you likely assume you have no part to play here. (I know I did.) But as it turns out, there are companies out there designing programs to help you build AI tools without needing a lick of code.

The idea behind no-code is simple: Everyone should be able to build programs, tools, and other digital services, regardless of their level of coding experience. While some take a low-code approach, which still requires some coding knowledge, the services on this list are strictly no-code. Specifically, they're no-code solutions to building AI tools.

You don't need to be a computer scientist to build your own AI tools. You don't even need to know how to code. You can train a neural network to identify a specific type of plant, or build a simple chatbot to help customers solve issues on your website.

That being said, keep your expectations in check here: The best AI tools are going to require extensive knowledge of both computer science and coding. But it's good to know there are utilities out there ready to help you build practical AI tools from scratch, without needing to know much about coding (or tech) in the first place.

If training a machine learning model sounds like something reserved for the AI experts, think again. While it's true that machine learning is a complicated practice, there's a way to build your own model for free with as few tools as a laptop and a webcam.

That's thanks to a program called Lobe: The free app, owned by Microsoft, makes it easy to build your own machine learning model to recognize whatever you want. Need your app to differentiate between colors? You can train it to do that. Want to make a program that can identify different types of plants? Train away.

You can see from the example video that you can train a model to identify when someone is drinking from a cup in only a few minutes. While you can include any images you may have previously taken, you can also simply snap some photos of you drinking from a cup from your webcam. Once you take enough sample photos of you drinking and not drinking, you can use those photos to train the model.

You can then test the model to see how well (or not) it can predict if you're drinking from a cup. In this example, it does a great job whenever it sees the cup in hand, but it incorrectly identifies holding a hand to your face as drinking as well. You can use feedback buttons to tell the model when it gets something wrong, so it can quickly retrain itself based on this information and hopefully make more accurate predictions going forward.

Google also has a similar tool for training simple machine-learning models called Teachable Machine, if you'd like to compare its offering to Microsoft's.

AI chatbots are all the rage lately. ChatGPT, of course, kicked off the modern AI craze because of its accessible yet powerful chat features, but everything from Facebook Messenger to healthcare sites have used chatbots for years. While OpenAI built ChatGPT with years of expertise, you can make your own chatbot without typing a single line of code.

Juji Studio wants to make building a light version of ChatGPT, in the company's words, "as easy as making PowerPoint slides." The program gives you the tools to build a working chatbot you can implement into your site or Facebook Messenger. That includes controlling the flow of the chatbot, adjusting its personality, and feeding it a Q&A list so it can accurately answer specific questions users might have.

Juji lets you start with a blank canvas, or base your chatbot on one of its existing templates. Templates include customer service bots, job interview bots, teaching assistant bots, and bots that can issue user experience surveys. No matter what you choose, you'll see the brains of your bot in a column on the left side of the screen.

It really does resemble PowerPoint slides: Each slide corresponds to a different task for the chatbot to follow. For example, with the customer service chatbot, you have an "invite user questions until done" slide, which is pre-programmed to listen to user questions until the user gives a "done" signal. You can go in and customize the prompts the chatbot will ask the user, such as asking for an account number or email address, or even more personal questions, like asking about a bad experience the user had, or the best part of their day.

You can, of course, customize the entire experience to your needs. For example, you can build a bot that changes its approach based on whether the user responds positively or negatively to an opinion-based question.

While Lobe is a great resource for training AI models with simple images, Akkio is the no-code AI tool for anyone looking to build AI models from their business data. You can pull data from sources like Salesforce, Snowflake, Google Sheets, Google BigQuery, and HubSpot (although Akkio says they can likely add data from another source for you).

On the surface, you can leverage AI to see your data in new ways. For example, you can ask Akkio to create a new column in your data set for average job length, and it will intelligently build you a new column by combining other data sets like TotalWorkingYears and NumCompaniesWorked.

But while AI-assisted tools are always welcome, the reason Akkio is on this list is because you can train neural networks with your imported data sets. Akkio walks you through the steps necessary to utilize machine learning to do more with your data, and prioritizes the best features for any given problem you're trying to solve. It even splits your data sets into two groups while building your model (a training set and a testing set), which weeds out bias in your new machine learning model. The training set trains the model on the data, while the testing set evaluates the accuracy of your new model.
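The split Akkio automates can be sketched in a few lines of plain Python; the data, the 80/20 ratio, and the simple least-squares classifier below are all illustrative assumptions, not Akkio's internals:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy tabular data: 100 rows, 3 feature columns, and a binary label
# that actually depends on the features via a hidden linear rule.
X = rng.normal(size=(100, 3))
y = (X @ np.array([1.0, -2.0, 0.5]) > 0).astype(int)

# Shuffle once, then hold out 20% of rows the model never sees.
idx = rng.permutation(len(X))
cut = int(0.8 * len(X))
train, test = idx[:cut], idx[cut:]

# Fit a simple least-squares classifier on the training rows only...
w, *_ = np.linalg.lstsq(X[train], 2 * y[train] - 1, rcond=None)

# ...and score it on the held-out rows: the honest accuracy estimate.
acc = np.mean((X[test] @ w > 0).astype(int) == y[test])
print(len(train), len(test), acc)
```

Scoring only on held-out rows is what catches a model that has merely memorized its training data, which is the bias the split is meant to weed out.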

Akkio will also help ensure your model performs well under stress at scale. The service has solutions for deploying your model in whichever setting you need, so you can actually do something with all that data youve accumulated.

For the ultimate no-code experience, you'll want to use a tool like Bubble. Bubble is one of the most popular solutions for creating programs without needing to code: You use an interface similar to something like Photoshop to build your app or service, dragging and dropping new UI elements and functions as necessary.

But while Bubble is a no-brainer for us code-illiterates to build things, it's also deeply integrated with AI. There are tons of AI applications you can include in your programs using Bubble: You can connect your builds to OpenAI products like ChatGPT, GPT-3, DALL-E 2, and Whisper, while at the same time taking advantage of plugins made by other Bubble members. All of these tools allow you to build a useful AI program by yourself, something that uses the power of GPT without needing to know how it works in the first place.

One of the best ways to get started here is by taking advantage of OpenAI Playground. Playground is similar to ChatGPT, in that it's based on OpenAI's large language models, but it isn't a chatbot. As such, you can use Playground to create different kinds of products and functions that you can then easily move to a Bubble project using the View Code button.

See original here:

Four Ways to Build AI Tools Without Knowing How to Code - Lifehacker