
Preventing Bias In Machine Learning – Texas A&M Today – Texas A&M University Today

Based on data, machine learning can quickly and efficiently analyze large amounts of information to provide suggestions and help make decisions. For example, phones and computers expose us to machine learning technologies such as voice recognition, personalized shopping suggestions, targeted advertisements and email filtering.

Dr. Na Zou

Texas A&M Engineering

Machine learning impacts extensive applications across diverse sectors of the economy, including health care, public services, education and employment opportunities. However, it also brings challenges related to bias in the data it uses, potentially leading to discrimination against specific individuals or groups.

To combat this problem, Dr. Na Zou, an assistant professor in the Department of Engineering Technology and Industrial Distribution at Texas A&M University, aims to develop a data-centric fairness framework. To support her research, Zou received the National Science Foundation's Faculty Early Career Development Program (CAREER) Award.

She will focus on developing a framework from different aspects of common data mining practices that can eliminate or reduce bias, promote data quality and improve modeling processes for machine learning.

"Machine learning models are becoming pervasive in real-world applications and have been increasingly deployed in high-stakes decision-making processes, such as loan management, job applications and criminal justice," Zou said. "Fair machine learning has the potential to reduce or eliminate bias from the decision-making process, avoid making unwarranted implicit associations or amplifying societal stereotypes about people."

According to Zou, fairness in machine learning refers to the methods or algorithms used to solve the phenomenon that machine learning algorithms naturally inherit or even amplify the bias in the data.

"For example, in health care, fair machine learning can help reduce health disparities and improve health outcomes," Zou said. "By avoiding biased decision making, medical diagnoses, treatment plans and resource allocations can be more equitable and effective for diverse patient populations."

Additionally, users of machine learning systems can enhance their experiences across various applications by mitigating bias. For instance, fair algorithms can incorporate individual preferences in recommendation systems or personalized services without perpetuating stereotypes or excluding certain groups.

To develop unbiased machine learning technologies, Zou will investigate data-centric algorithms capable of systemically modifying datasets to improve model performance. She will also look at theories that facilitate fairness through improving data quality, while incorporating insights from previous research in implicit fairness modeling.

The challenge of developing a fairness framework lies in problems within the original data used in machine learning technologies. In some instances, the data may lack quality, leading to missing values, incorrect labels and anomalies. In addition, when the trained algorithms are deployed in real-world systems, they usually face problems of deteriorated performance due to data distribution shifts, such as a covariate or concept shift. Although the data can be incomplete, it is used to make impactful decisions throughout various fields.

"For example, the trained models on images from sketches and paintings may not achieve satisfactory performance when used in natural images or photos," Zou said. "Thus, the data quality and distribution shift issues make detecting and mitigating models' discriminative behavior much more difficult."

If successful, Zou believes the outcome of this project will lead to advances in facilitating fairness in computing. The project will produce effective and efficient algorithms to explore fair data characteristics from different perspectives and enhance generalizability and trust in the machine learning field. This research is expected to impact the broad utilization of machine learning algorithms in essential applications, enabling non-discriminatory decision-making processes and promoting a more transparent platform for future information systems.

"Receiving this award will help me achieve my short-term and long-term goals," Zou said. "My short-term goal is to develop fair machine learning algorithms through mitigating fairness issues from computational challenges and broadening the impact through disseminating research outcomes and a comprehensive educational toolkit. The long-term goal is to extend the efforts to all aspects of society to deploy fairness-aware information systems and enhance society-level fair decision-making through intensive collaborations with industries."


Apple’s Commitment to Generative AI and Machine Learning – Fagen wasanni

In recent months, there has been a surge in the development of artificial intelligence (AI) chatbots, with ChatGPT making waves in the industry. Tech giants like Microsoft and Google have also announced their plans to integrate AI technology into their products. Amidst this flurry of activity, Apple has been relatively quiet, leading some to believe that generative AI technology is not a priority for the company, and that it may be falling behind its competitors.

However, those familiar with Apple's approach know that the company does not make bold proclamations about its projects until it has a tangible product to showcase. Apple CEO Tim Cook addressed this misconception in an interview with Reuters following the company's quarterly earnings call. He emphasized that generative AI research has been a long-standing initiative for Apple, and the company has been investing billions of dollars in research and development, with a significant portion allocated to AI.

While Apple may not be as overt in its AI initiatives as its rivals, Cook pointed out that AI will be integrated into Apple's products to enhance user experiences, rather than being offered as standalone AI products like ChatGPT. For instance, he highlighted features such as Live Voicemail Transcription in the upcoming iOS 17 as examples of how AI will power new functionalities in Apple products.

Apple has been incorporating machine learning features into its devices for years, utilizing the Neural Engine in its A-series and M-series chips. The company has made significant advancements in areas like computational photography, voicemail transcription, visual lookup, language translation, and augmented reality. This progress is exemplified by Apple's hiring of Google's former Head of AI, John Giannandrea, as its Senior Vice President of Machine Learning and Artificial Intelligence Strategy.

Undoubtedly, AI has the potential to greatly enhance Apple's voice assistant, Siri. Despite starting earlier than competitors like Alexa and Google Assistant, Siri lost its early lead, but Apple has been working to rectify that in recent years. It is likely that Giannandrea has been tasked with bolstering Siri's capabilities.

Cook emphasized that while Apple may not have the same breadth of AI-centric services as other companies, AI and machine learning are fundamental core technologies embedded in every Apple product. Apple's commitment to generative AI and machine learning remains strong, as evidenced by its substantial investments in research and development, the incorporation of AI features into its products, and the strategic hiring of top AI talent.


Richmond could become AI and machine learning tech hub – The Daily Progress

Can Richmond be the capital of artificial intelligence?

A local group is pushing to turn the region into an innovation hub for artificial intelligence and machine learning in the coming years. Many of the companies and experts pushing the limits of these technologies could be based in the Richmond area should a grant be awarded to the group.

The recent emergence of AI tools like ChatGPT and Bard AI has been touted as a revolutionary leap in human technology, with the ability to impact nearly every field and the need for all companies to become fluent in AI.

The Richmond Technology Council, branded as rvatech, is a member-driven association of companies actively trying to grow Richmond's tech-based economy. The group is applying for a federal grant worth between $50 million and $70 million that would establish Richmond as one of about 20 tech hubs around the country.

The U.S. Economic Development Administration's Technology Hub Grant Program is targeting 10 areas of technology. Some are in fields like robotics, advanced computing and semiconductors, advanced communications, biotechnology and advanced energy, like nuclear. rvatech submitted an application specifically for AI and machine learning.

"We're trying to position Richmond as the leading edge of artificial intelligence and machine learning so that if you're a company that is in that space, this is a good place to find talent and to headquarter here," said Nick Serfass, CEO of rvatech. "If you want to enter the space, it's a good place to come and learn and be exposed to thought leaders and other practitioners who are in the space."

The application process for these grants is expected to be competitive with regions across the country keen on raising their profile in tech, and AI. By this fall, applicants will be narrowed to 20 regions.

"It's transformative in terms of what it could do for a metropolitan area," Serfass said. "We don't know of any other artificial intelligence and machine learning applications going out as of now."

The terms AI and machine learning are often used interchangeably, though machine learning is really a subcategory of AI. The field of AI essentially creates computers and robots that can both mimic and exceed human capabilities. It can be used to automate tasks without the need for human input, or to take in massive amounts of information and make decisions.

Machine learning is a pathway to artificial intelligence. It uses algorithms to recognize patterns in data that can make increasingly better decisions.

These tools can be applied to fields like manufacturing, banking, health care and customer service. AI can recognize errors and malfunctions in equipment before they happen, or detect and prevent cybersecurity attacks. Everyday people are also using AI to do household tasks like planning workouts or meals, sending emails and making music playlists.

The bulk of the funding from the federal grant would go towards workforce and talent development through higher education, workforce programs at mid-career leadership levels or talent attraction, bringing in top professionals from other areas.


More companies and workers in the space could later lead to more physical changes like lab and research facilities.

Serfass says the Richmond tech scene is well-positioned to be transformed into a hub. A report from commercial real estate and investment firm CBRE listed Richmond in the top 50 tech talent markets nationwide.

A high density of Fortune 500 companies are headquartered in the city and its surrounding counties. Many of those rely either entirely on tech, or have tech-focused sides of their businesses that would benefit from AI and machine learning.

Serfass also cited Richmond's status as the seat of state government as an asset, and Dominion's presence in the area as an entity that could revolutionize infrastructure through the use of AI. There is also a major presence of data centers from Meta and QTS in Henrico's White Oak Technology Park, which are a critical asset to digital businesses.

"Several different elements highlight the merit of the city and why it could be a great tech hub. It's really the fact that we have such a 360-degree set of resources and assets here in town that could help us thrive."

Richmond has also grown a startup fostering and acceleration scene, largely through Capital One's Michael Wassmer Innovation Center in Shockoe Bottom. Programs like Startup Virginia and accelerator Lighthouse Labs have helped countless young companies grow, many with focuses in tech.

"Richmond has a history of providing focused tech solutions, including data analysis, AI and machine learning in niche markets. As a tech-focused acceleration program, we are always on the lookout for startups utilizing these new technologies, and we've seen more and more apply each cycle," said Art Espy, managing director for Lighthouse Labs. "We love it when a local or regional company is a fit for our program; building a hub here would give us even more homegrown tech startups to accelerate, while adding even more vibrancy to our thriving startup ecosystem."

rvatech is currently writing the mission statement for its grant application which could also include the need to bring underserved populations into the industry. Serfass said tech lends itself well to certifications in lieu of college degrees, which offers accessible entry into the field.

A second group in Richmond is pursuing a grant from the Tech Hubs Program. The Alliance for Building Better Medicine, which has an application in the area of advanced pharmaceutical manufacturing, has been a national leader in pharma development, seeking to onshore medicine making from overseas and create a more robust U.S. supply chain for medications.

"The tech hubs initiative is an exciting opportunity for Greater Richmond and we are supporting not one, but two applications from our region this year," said Jennifer Wakefield, president and CEO of the Greater Richmond Partnership. "Both community partners, rvatech and the Alliance to Build Better Medicine, see the promise of elevating Greater Richmond and its assets, which greatly benefits economic development and business attraction to the area."



Platform Reduces Barriers Biologists Face In Accessing Machine … – Bio-IT World

August 1, 2023 | A group of scientists at the Wyss Institute for Biologically Inspired Engineering at Harvard University and MIT are convinced that automated machine learning (autoML) is going to revolutionize biology by removing many of the technical barriers to using computational models to answer fundamental questions about sequences of nucleic acids, peptides, and glycans. Machine learning can be complicated, but it doesn't have to be, and sometimes simpler is better, according to graduate student Jackie Valeri, a big believer in the power of autoML to solve real-world problems.

AutoML is a machine learning concept that helps users transfer data to training algorithms and automatically search for the best ML architecture for a given issue, lowering the demand for expert-level computational knowledge that currently outpaces the supply. It can also be pretty competitive with even the best manually designed ML models, which can take months if not years to develop, says Valeri, as she and her colleagues recently demonstrated in a paper published in Cell Systems (DOI: 10.1016/j.cels.2023.05.007).

The article showcased the potential of their novel BioAutoMATED platform which, unlike other autoML tools, accommodates more than one type of ML model and is designed to accept biological sequences. Its intended users are systems and synthetic biologists with little or no ML experience, says Valeri, who works in the lab of Jim Collins, Ph.D. at the Wyss Institute.

The all-in-one BioAutoMATED platform modifies three existing autoML tools (AutoKeras, which searches for optimal neural networks; DeepSwarm, which looks for convolutional neural networks; and TPOT, which hunts for a variety of other, simpler modeling techniques such as linear regression and random forest classifiers) to come up with the most appropriate model for a user's dataset, she explains. Standardized output results are presented as a set of folders, each associated with one of those search techniques, revealing the best performing model in graphic and text file format.
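To make the idea of an automated model search concrete, here is a minimal sketch of the kind of search one of those wrapped engines, TPOT, performs on its own. This is not the BioAutoMATED interface; the toy dataset, the parameter values, and the output file name are assumptions chosen purely for illustration.

```python
# Generic illustration of an autoML search with TPOT (one of the engines
# wrapped by BioAutoMATED). NOT the BioAutoMATED API; toy data and
# parameter choices here are hypothetical.
import numpy as np
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

# Toy stand-in for an encoded biological sequence dataset:
# 500 one-hot-style feature vectors with a binary functional label.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 80)).astype(float)
y = rng.integers(0, 2, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# TPOT searches over preprocessing steps and simple models (linear models,
# random forests, etc.) using genetic programming.
tpot = TPOTClassifier(generations=5, population_size=20,
                      random_state=0, verbosity=2)
tpot.fit(X_train, y_train)
print("held-out accuracy:", tpot.score(X_test, y_test))

# Export the best pipeline found as plain scikit-learn code.
tpot.export("best_pipeline.py")
```

The appeal for non-programmers is that the search, rather than the user, decides which preprocessing steps and model families to try.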

The tool is very meta, says Valeri, in that it is learning on the learning. Model selection is often the part of research projects that requires a lot of computational expertise biologists generally do not possess, and the task can't be easily passed to an ML specialist even if one is to be found, because domain knowledge is needed in the model-building process.

Overall, biological researchers are excited about using machine learning but until now have been stymied by the amount of coding needed to get started, she says, noting that it is not uncommon for ML models to have a codebase of over 750 lines. The installation of packages alone can be a huge barrier.

Interest in ML has skyrocketed over the past year thanks largely to the introduction of ChatGPT with its user-friendly interface, but people have also quickly discovered they can't trust everything the large language model has to offer, says Valeri. Similarly, BioAutoMATED is useful but not a magic bullet that erases data problems and, like ML in general, should be approached with a healthy amount of skepticism to ensure it is learning what's intended.

BioAutoMATED will in the future likely be used together with ChatGPT, predicts Wyss postdoctoral fellow Luis Soenksen, Ph.D., co-lead author on the Cell Systems paper. Researchers will simply articulate what they want to do and be presented with the best questions, required data, and ML models to get the job done.

When put to the test, BioAutoMATED outperformed not only other autoML tools but also some of the models created by a professional ML expert, and did it in under 30 minutes using only 10 lines of input code from the user. The required coding is for the basics, says Valeri, to specify the target folder for results, the file name where input data can be found, the column name where sequences can be found within that file, and run times for these extensions.

Users are instructed to first install Docker on their computer, if they have not done so already, and are walked through the process of doing that, she adds. The open software platform sets up its own environment for running applications, requiring only two lines of code to access the Jupyter notebooks preloaded on BioAutoMATED that contain everything needed to run the autoML tool. It's a quick start for most people accustomed to using a computer.

With a bit more coding, users can access some of the embedded extras, says Valeri. These include the outputs from scrambled control tests where BioAutoMATED generates sequences by shuffling the order of nucleotides, answering the frequently asked question of whether models are picking up on real order-and sequence-specific biology.

Half of the battle in biological research is knowing how to ask the right questions, says Soenksen. The platform helps users do that as well as provides insights leading to new questions, hypotheses, models, and experiments.

Users can also opt for data saturation tests where BioAutoMATED sequentially reduces the dataset size to see the effect on model performance, Valeri says. "If you can say the models do great with 20,000 sequences, maybe you don't have to go to the effort of collecting 50,000 or 100,000 sequences, which is a real impactful finding for a biologist actually doing the experiments."

Two of the most exciting outputs from the tool, in Valeri's mind, are the interpretation and design results. Interpretation results indicate what a model is learning (e.g., nucleotides of elevated importance), including sequence logos, where the larger the letter in the sequence, the more important it is to whatever function of interest is being examined. Sequence logos of the raw data can also be generated to facilitate comparisons across ML tools.

Biologists using BioAutoMATED in this way can expect some actionable outputs, says Valeri. They might want to pay more attention to a motif that pops up through all these sequence logos, for example, or do a deep mutational scanning of a targeted region of the sequence that appears to be most important.

The other key output is a list of de novo design sequences that are optimized for whatever function the model has been trained on, she says. For the newly published study, this focused on the downstream efficiency of a ribosome binding site to translate RNA into protein in E. coli bacteria.

BioAutoMATED was also used to identify areas of the sequence most important in determining translation efficiency, and to design new sequences that could be tested experimentally. Further, the platform generated highly accurate information about amino acids in a peptide sequence most critical in determining an antibody's ability to bind to the drug ranibizumab (Lucentis), as well as classified different types of glycans into immunogenic and non-immunogenic groups based on their sequences.

Finally, the team had the platform optimize the sequences of RNA-based toehold switches. This informed the design of new toehold switches for experimental testing with minimal input coding required.

The time it takes to obtain results from BioAutoMATED depends on several factors, including the question being asked and the size of the dataset for model training, says Valeri. "We've found the length of the sequence is a really big factor... and the compute resources you have available."

The maximum user-allowed time for obtaining results is another important consideration, adds Soenksen. The platform can search for hours or days, as circumstances dictate. Time constraints are routinely employed when training ML models as a matter of practicality.

Soenksen and Valeri both use BioAutoMATED as a benchmark for their own custom-built models, and friends that have tested the platform on different machines are enthusiastic about its potential, they say. In the manuscript, the platform also had good performance on many different datasets, including ones specific to sequence lengths and types.

"I have personally used it for some quick paper explorations, trying to see what data are available... [without] having to take the time to code up my own machine learning models," says Valeri. Although it is too soon to know how the tool will be used by biologists elsewhere, it is already being used regularly by a handful of scientists at Harvard investigating short DNA, RNA, peptide, and glycan sequences.

BioAutoMATED is available to download from GitHub. "If we get a lot of traction [with it], and I think we will, our team will probably put more resources into the user interface," notes Soenksen, a serial entrepreneur in the science and technology space. The long-term goal is to make the tool usable by clicking buttons to further lower barriers to access.

"If you're a machine learning expert, you'll probably be able to beat the output of BioAutoMATED," adds Valeri. "We are just trying to make it easy for people with limited machine learning expertise to [quickly] get to a pretty good model."

Complicated neural networks and big language models, which have a lot of parameters and require large amounts of data, are not always best, she says. The simple-model techniques identified by TPOT can be quite well suited to the often-limited datasets biologists have available and can perform as well as if not better than systems with more advanced ML architecture.


Postdoctoral Fellowship: Pathogenesis of High Consequence … – Global Biodefense

A research opportunity is currently available with the U.S. Department of Agriculture (USDA), Agricultural Research Service (ARS), located in Frederick, Maryland.

The Agricultural Research Service (ARS) is the U.S. Department of Agriculture's chief scientific in-house research agency, with a mission to find solutions to agricultural problems that affect Americans every day, from field to table. ARS will deliver cutting-edge, scientific tools and innovative solutions for American farmers, producers, industry, and communities to support the nourishment and well-being of all people; sustain our nation's agroecosystems and natural resources; and ensure the economic competitiveness and excellence of our agriculture. The vision of the agency is to provide global leadership in agricultural discoveries through scientific excellence.

Research Project: We are seeking doctoral-level (Ph.D., M.D., D.V.M.) scientists passionate about researching zoonotic and emerging diseases that impact human and animal health. Under the guidance of a mentor, the Postdoctoral fellows will conduct advanced machine learning-based research to analyze histopathology slides from high-consequence viral infections, such as Crimean Congo Hemorrhagic Fever, Nipah, Hendra, Ebola, and Marburg viruses, with the ultimate goal of understanding mechanisms of pathogenesis.

Learning Objectives: Fellows will learn to build pipelines for training machine learning models and slide analysis. Fellows will gain expertise in histopathology, and machine learning using TensorFlow, U-net, and QuPath. Under the guidance of a mentor, fellows will be expected to develop a scientific project and publish in peer-reviewed publications. As a result of participating in this fellowship, participants will enhance their:

Projects will be jointly performed with Dr. C. Paul Morris at the Integrated Research Facility at Fort Detrick in Frederick, Maryland. Fellows may be required to undergo background investigations to obtain access to facilities.

Mentor(s): The mentor(s) for this opportunity is Lisa Hensley (lisa.hensley@usda.gov). If you have questions about the nature of the research, please contact the mentor(s).

Anticipated Appointment Start Date: 2023. Start date is flexible and will depend on a variety of factors.

Appointment Length: The appointment will initially be for one year, but may be renewed upon recommendation of ARS and is contingent on the availability of funds.

Level of Participation: The appointment is full-time.

Participant Stipend: The participant will receive a monthly stipend commensurate with educational level and experience.

Citizenship Requirements: This opportunity is available to U.S. citizens, Lawful Permanent Residents (LPR), and foreign nationals. Non-U.S. citizen applicants should refer to the Guidelines for Non-U.S. Citizens Details page of the program website for information about the valid immigration statuses that are acceptable for program participation.

ORISE Information: This program, administered by ORAU through its contract with the U.S. Department of Energy (DOE) to manage the Oak Ridge Institute for Science and Education (ORISE), was established through an interagency agreement between DOE and ARS. Participants do not become employees of USDA, ARS, DOE or the program administrator, and there are no employment-related benefits. Proof of health insurance is required for participation in this program. Health insurance can be obtained through ORISE.

Questions: Please visit our Program Website. After reading, if you have additional questions about the application process, please email ORISE.ARS.Plains@orau.org and include the reference code for this opportunity.

Qualifications:

The qualified candidate should have received a doctoral degree in one of the relevant fields or be currently pursuing the degree with completion before the start of the appointment. Degree must have been received within the past three years.

Preferred Skills:

Eligibility Requirements:

Degree: Doctoral degree received within the last 36 months or currently pursuing. Disciplines: Communications and Graphics Design; Computer, Information, and Data Sciences; Life, Health, and Medical Sciences; Mathematics and Statistics.

Apply for USDA-ARS-P-2023-0162 by 22 Dec 2023.


Johns Hopkins makes major investment in the power, promise of … – The Hub at Johns Hopkins

By Hub staff report

Johns Hopkins University today announced a major new investment in data science and the exploration of artificial intelligence, one that will significantly strengthen the university's capabilities to harness emerging applications, opportunities, and challenges presented by the explosion of available data and the rapid rise of accessible AI.

At the heart of this interdisciplinary endeavor will be a new data science and translation institute dedicated to the application, understanding, collection, and risks of data and the development of machine learning and artificial intelligence systems across a range of critical and emerging fields, from neuroscience and precision medicine to climate resilience and sustainability, public sector innovation, and the social sciences and humanities.

The institute will bring together world-class experts in artificial intelligence, machine learning, applied mathematics, computer engineering, and computer science to fuel data-driven discovery in support of research activities across the institution. In all, 80 new affiliated faculty will join JHU's Whiting School of Engineering to support the institute's pursuits, in addition to 30 new Bloomberg Distinguished Professors with substantial cross-disciplinary expertise to ensure the impact of the new institute is felt across the university.


The institute will be housed in a state-of-the-art facility on the Homewood campus that will be custom-built to leverage a significant investment in cutting-edge computational resources, advanced technologies, and technical expertise that will speed the translation of ideas into innovations. AI pioneer Rama Chellappa and KT Ramesh, senior adviser to the president for AI, will serve as interim co-directors of the institute while the university launches an international search for a permanent director.

"Data and artificial intelligence are shaping new horizons of academic research and critical inquiry with profound implications for fields and disciplines across nearly every facet of Johns Hopkins," JHU President Ron Daniels said. "I'm thrilled this new institute will harness our university's innate ethos of interdisciplinary collaboration and build upon our demonstrated capacity to deliver impactful research at the forefront of this critical age of technology."

The creation of a data science and translation institute, supported through institutional funds and philanthropic contributions, will represent the realization of one of the 10 goals identified in the university's new Ten for One strategic plan: to create the leading academic hub for data science and artificial intelligence to drive research and teaching in every corner of the university and magnify our impact in every corner of the world.

The 21st century is already being defined by an explosion of available data across an almost incomprehensible array of subject areas and domains, from wearables and autonomous systems, to genomics and localized climate monitoring. The International Data Corporation, a global leader in market intelligence, estimates that the total amount of digital data generated will grow more than fivefold in the next few years, from an estimated 33 trillion gigabytes of information in 2021 to 175 trillion gigabytes by 2025.

"It's not hyperbole to say that data and AI to help us make informed use of that information have vast potential to revolutionize critical areas of discovery and will increasingly shape nearly every aspect of the world we live in," said Ed Schlesinger, dean of the Whiting School of Engineering. "As one of the world's premier research institutions, and with our existing expertise in foundational fields at the Whiting School, Johns Hopkins is uniquely positioned to play a lead role in determining how these transformative technologies are developed and deployed now and in the future."

Johns Hopkins has met the moment with several data-driven initiatives and investments, building on long-standing expertise in data science and AI to launch the AI-X Foundry earlier this year. Created to explore the vast potential of human collaboration with artificial intelligence to transform medicine, public health, engineering, patient care, and other disciplines, the AI-X Foundry represents a critical first step toward the creation of a data science and translation institute.

Additional JHU programs that will contribute to the new institute include:

Johns Hopkins is also home to the renowned Applied Physics Laboratory, the nation's largest university-affiliated research center, which has for decades conducted leading-edge research in data science, artificial intelligence, and machine learning to help the U.S. address critical challenges.

But there remains significant untapped potential to use data, artificial intelligence, and machine learning to expand and enhance research and discovery in nearly every area of the university, particularly in fields where the power of data is only now being realized. As Johns Hopkins Bloomberg Distinguished Professor Alex Szalay, an astrophysicist and pioneering data scientist, has said: "The most impactful research universities of the future will be those with scholars who possess meaningful depth in data and another domain, and are equipped with the ability to bridge between these disciplines."

To that end, the new institute will be a hub for interdisciplinary data collaborations with experts in divisions across Johns Hopkins, with affiliated faculty, graduate students, and postdoctoral fellows working together to apply big data to pressing issues. Their work will be supported by the latest techniques and technologies and by experts in data translation, data visualization, and tech transfer, shortening the path from discovery to impact and fostering the development of future large-scale data projects that serve the public interest, such as the award-winning Johns Hopkins Coronavirus Resource Center.

"The Coronavirus Resource Center is just one example of the power of data science and translation and its capacity to guide lifesaving decisions," said Beth Blauer, associate vice provost for public sector innovation and data lead for the CRC. "Our ability to harness data and connect it not just to public policy and innovation but to guide the deeply personal decisions we make every day speaks to the magnitude of this investment and its potential impact. There is no other institution more poised than Johns Hopkins University to guide us."

Johns Hopkins will develop this new institute with a commitment to data transparency and accessibility, highlighting the need for trust and reproducibility across the research enterprise and making data available to inform policymakers and the public. The institute will support open data practices, adhering to standards and structures that will make the university's data easier to access, understand, consume, and repurpose.

Additionally, institute scholars will partner with faculty from across the institution in fields including bioethics, sociology, philosophy, and education to support multidisciplinary research that helps academia and industry alike understand the societal and ethical concerns posed by artificial intelligence, the power and limitations of these tools, and the role for, and character of, appropriate government policy and regulation.

"As both data and the tools for harnessing data have become widespread, artificial intelligence and data-driven technologies are accelerating advances that will shape academic and public life for the foreseeable future," said Stephen Gange, JHU's interim provost and senior vice president for academic affairs. "The investment will ensure Johns Hopkins remains on the forefront of research, policy development, and civic engagement."


Predicting BRAFV600E mutations in papillary thyroid carcinoma … – Nature.com

Patients

A retrospective analysis was performed on PTC patients who had undergone preoperative thyroid US elastography, BRAFV600E mutation diagnosis, and surgery at Jiangsu University Affiliated People's Hospital and the traditional Chinese medicine hospital of Nanjing Lishui District between January 2014 and 2021. The enrolling process is displayed in Fig. 1. In total, 138 PTCs from 138 patients (mean age, 41.63 ± 11.36 [range, 25–65] years) were analyzed in this study. The patients were divided into a BRAFV600E mutation-free group (n = 75) and a BRAFV600E mutation group (n = 63). Using a stratified sampling technique at a 7:3 ratio, all patients were randomly assigned to either the training group (n = 96) or the validation group (n = 42). The following criteria were required for inclusion: postoperative pathology indicated PTC; preoperative thyroid US elastography evaluation; related US images and diagnostic outcomes; maximum nodule diameter > 5 mm and < 5 cm; and unilateral, single focal lesion. The exclusion criteria included a maximum nodule diameter > 5 cm and indistinct US imaging of nodules caused by artifacts. The clinical details of the enrolled patients were documented, including age, sex, nodule diameter, nodule location, nodular echo, nodule boundary, nodule internal and peripheral blood flow, nodule elastic grading, calcification, CLNM, and BRAFV600E mutation results. The Jiangsu University Affiliated People's Hospital and the traditional Chinese medicine hospital of Nanjing Lishui District Ethics Committee approved this study. Because it was retrospective in nature, the study did not require written informed consent.

Schematic diagram of the patient selection. PTC, papillary thyroid carcinoma.

Two ultrasonic devices were used: the Philips Q5 (Philips Healthcare, Eindhoven, Netherlands) and the GE LOGIC E20 (GE Medical Systems, USA) (L12-5 linear array probe, frequency: 10–14 MHz).

To acquire longitudinal and transverse images of the thyroid nodules, continuous longitudinal and transverse scanning was done while the patients were supine. Blood flow in and around the nodule, strain elastic grading of the nodule, calcification, and CLNM were all visible on the coexisting diagram, which also included the nodule diameter, location, echo, and boundary.

The cross-sectional image's position and the size of the sampling frame were adjusted, and the strain elastic imaging mode was activated. With an ROI larger than the nodule (generally more than two times), the nodules were placed in the middle of the elastic imaging zone. Pressure was applied steadily (range 1–2 mm, 1–2 times/s) while the probe was perpendicular to the nodule. When the linear strain hint graph (green spring) suggested stability, the freeze key was pressed to obtain an elastic image; the ROI's color changed (green indicated soft; red indicated hard), and the nodule's hardness was determined based on elasticity. The elastic image was graded according to the following criteria: one point, a nodular area that alternates between red, green, and blue; two points, a nodule that is partially red and partially green (mostly green, area > 90%); three points, a nodule area that is primarily green, with surrounding tissues visible in red; four points, a nodule area that is primarily red, with the red area > 90%; and five points, a nodule area that is completely covered in red.

One week prior to surgery, thyroid US exams were conducted. US image segmentation was done manually. Using the ITK-SNAP program (http://www.itksnap.org), the ROIs were manually drawn on each image (Fig. 2). The grayscale images were used to create a sketch outline of the tumor regions in the elastography US images.

(A) Conventional B-mode ultrasound image of papillary thyroid carcinoma. (B) Corresponding ultrasound elastography image, with the circle labeled A indicating the lesion region and the circle labeled B indicating a reference area. (C) Corresponding image after the region of interest (ROI) segmentation step.

Radiomic features were extracted using PyRadiomics (https://github.com/Radiomics/pyradiomics). A total of 479 radiomic features were recovered from each ROI's elastography US images. These included first-order, Gray Level Co-occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Size Zone Matrix (GLSZM), Gray Level Dependence Matrix (GLDM), and Neighbouring Gray Tone Difference Matrix (NGTDM) features, as well as the same first-order, GLCM, GLRLM, GLSZM, GLDM, and NGTDM features derived from wavelet-filtered images.
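A minimal PyRadiomics sketch of this extraction step is shown below. The file paths and the settings dictionary are hypothetical placeholders; the authors' exact extraction configuration is not reproduced here, only the feature classes and image types named in the text.

```python
# Minimal PyRadiomics sketch: extract first-order and texture (GLCM, GLRLM,
# GLSZM, GLDM, NGTDM) features from the original and wavelet-filtered images
# of one ROI. Paths and settings are hypothetical, not the paper's config.
from radiomics import featureextractor

settings = {"binWidth": 25, "force2D": True}  # assumed settings for 2D US images
extractor = featureextractor.RadiomicsFeatureExtractor(**settings)

# Restrict to the feature classes named in the text.
extractor.disableAllFeatures()
for cls in ["firstorder", "glcm", "glrlm", "glszm", "gldm", "ngtdm"]:
    extractor.enableFeatureClassByName(cls)

# Original image plus wavelet-filtered versions of it.
extractor.enableImageTypeByName("Original")
extractor.enableImageTypeByName("Wavelet")

# Image and mask exported from ITK-SNAP (hypothetical file names).
features = extractor.execute("us_image_001.nrrd", "us_mask_001.nrrd")
numeric = {k: v for k, v in features.items() if not k.startswith("diagnostics")}
print(len(numeric), "radiomic features extracted for this ROI")
```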

The retrieved features were normalized using a standard scaler to reduce bias and overfitting in the study. The dataset was divided into training and validation cohorts. To make each characteristic substantially independent, the feature dimension of the feature matrix was reduced using the Pearson correlation coefficient (PCC). Every pair of features with a PCC of more than 0.80 was deemed redundant, and one feature of each such pair was removed.
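A short sketch of this standardization and redundancy filter follows. The convention of dropping the later column of each correlated pair, and the variable names, are assumptions for illustration.

```python
# Sketch of the Pearson-correlation redundancy filter described above:
# standardize features, then drop one feature from every pair whose
# absolute pairwise correlation exceeds 0.80. Which member of a pair is
# dropped (the later column here) is an assumption.
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

def drop_redundant(df: pd.DataFrame, threshold: float = 0.80) -> pd.DataFrame:
    scaled = pd.DataFrame(StandardScaler().fit_transform(df),
                          columns=df.columns, index=df.index)
    corr = scaled.corr(method="pearson").abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return scaled.drop(columns=to_drop)

# Usage (hypothetical radiomics feature table, one row per ROI):
# reduced = drop_redundant(radiomics_df)  # 479 columns -> fewer, pairwise |r| <= 0.80
```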

After PCC, recursive feature elimination (RFE) for feature selection was applied to the whole dataset using the Scikit-learn Python module24 to choose representative features for the training cohort. During the RFE procedure, the following parameters were used: cross-validation was set to stratified k-fold with 10 splits, the random state was set to 101, the minimum number of features to select was set to 3, and accuracy was employed for scoring.
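The following is a hedged sketch of that cross-validated RFE step with the parameter values quoted above. The base estimator is not stated in the text; a linear SVM is assumed here purely for illustration.

```python
# RFE with cross-validation using the settings quoted above:
# stratified 10-fold CV, random_state=101, min_features_to_select=3,
# accuracy scoring. The estimator choice is an assumption.
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=101)
selector = RFECV(
    estimator=SVC(kernel="linear"),   # assumed base estimator
    step=1,
    cv=cv,
    scoring="accuracy",
    min_features_to_select=3,
)

# X_train: redundancy-filtered radiomic features; y_train: BRAFV600E labels
# selector.fit(X_train, y_train)
# selected_features = X_train.columns[selector.support_]
```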

The Support Vector Machine with the linear kernel (SVM_L), Support Vector Machine with the radial basis function kernel (SVM_RBF), Logistic Regression (LR), Naïve Bayes (NB), K-nearest Neighbors (KNN), and Linear Discriminant Analysis (LDA) classifiers were used to build the prediction models from the RFE-selected key features. All six algorithms were implemented using the Scikit-learn machine learning library24.
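A sketch of the six scikit-learn classifiers named above is given below. Hyperparameters are left at library defaults because the text does not specify them; this is an illustrative setup, not the authors' exact configuration.

```python
# Six classifiers used to build the prediction models (default settings
# are assumptions; probability=True is enabled so AUC can be computed).
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

models = {
    "SVM_L": SVC(kernel="linear", probability=True),
    "SVM_RBF": SVC(kernel="rbf", probability=True),
    "LR": LogisticRegression(max_iter=1000),
    "NB": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "LDA": LinearDiscriminantAnalysis(),
}

# for name, model in models.items():
#     model.fit(X_train_selected, y_train)
#     prob = model.predict_proba(X_val_selected)[:, 1]  # used for AUC
```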

The same feature sets were chosen and fed into the model during the validation process. Standard clinical statistics like the area under the curve (AUC), sensitivity, specificity, negative predictive value (NPV), positive predictive value (PPV), and accuracy (ACC) were used to evaluate the model's performance on the training and validation datasets.
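For completeness, the clinical metrics listed above can all be derived from a binary confusion matrix plus the ROC AUC, as in this sketch (variable names are placeholders).

```python
# Evaluation metrics named above, computed from a binary confusion matrix
# and the predicted probabilities. Labels assumed: 1 = mutation, 0 = no mutation.
from sklearn.metrics import confusion_matrix, roc_auc_score

def clinical_metrics(y_true, y_pred, y_prob):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "ACC": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # recall for the mutation class
        "specificity": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "AUC": roc_auc_score(y_true, y_prob),
    }

# metrics = clinical_metrics(y_val, model.predict(X_val_selected), prob)
```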

Python (version 3.7, https://www.python.org/, accessed 8 July 2021) and IBM SPSS Statistics for Windows version 26.0 (IBM Corp., Armonk, New York, USA) were used for statistical analyses. Pearson's chi-square and Fisher's exact tests were used to compare the differences in categorical characteristics. The independent sample t-test was used for continuous factors with normal distribution, whereas the Mann–Whitney U test was used for continuous factors without normal distribution.

A two-sided P < 0.05 indicated statistically significant differences. PyRadiomics (version 2.2.0, https://github.com/Radiomics/pyradiomics, accessed 10 August 2021) and scikit-learn version 1.224 were used to extract radiomic features and build the prediction models. Each prediction model's AUC, sensitivity, specificity, ACC, NPV, and PPV were calculated.

MedCalc Statistical Software was used to calculate the six models' AUCs and evaluate the predictions. The DeLong method was used to compare the AUCs of the six machine learning classifiers. To create calibration curves, scikit-learn version 1.224 was used. R software (version 3.6.1, https://www.r-project.org) was used to perform the decision curve analysis.

The study was conducted in accordance with the Declaration of Helsinki and approved by the Jiangsu University-Affiliated People's Hospital and traditional Chinese medicine hospital of Nanjing Lishui District Ethics Committee.

Patient consent was waived by the Jiangsu University-Affiliated People's Hospital and traditional Chinese medicine hospital of Nanjing Lishui District Ethics Committee due to the retrospective nature of the study.


Machine learning prediction and classification of behavioral … – Nature.com

2013 TSA cohort traits

The traits scored in the cohort represent measures of confidence/fear, quality of hunting related behaviors, and dog-trainer interaction characteristics19,20. The traits Chase/Retrieve, Physical Possession, and Independent Possession were measured in both the Airport Terminal and Environmental tests whereas five and seven other traits were specific to each test, respectively (Table 1). The Airport Terminal tests include the search for a scented towel placed in a mock terminal and observation of a dog's responsiveness to the handler. This represents the actual odor detection work expected of fully trained and deployed dogs. Because the tasks were consistent between the time periods, the Airport Terminal tests demonstrate improvements of the dogs with age. All trait scores except for Physical and Independent Possession increased over time, with the largest increase between the 6- and 9-month tests (Fig. 1a). This may be due to puppies having increased possessiveness and lack of training at younger ages. The general improvement over time could be due to the increased age of the dogs or to the testing experience gained. Compared to accepted dogs, those eliminated from the program for behavioral reasons had lower mean scores across all traits.

(a) Radar plots of the mean scores for each of the traits for the airport terminal tests. (b) Radar plots of the mean scores for each of the traits in the environmental tests; M03=BX (gift shop), M06=Woodshop, M09=Airport Cargo, M12=Airport Terminal.

Environmental tests involved taking dogs on a walk, a search, and playing with toys in a noisy location that changed for each time point. The traits measured a variety of dog behaviors as they moved through the locations, and their performance while engaging with toys. Accepted dogs had both higher and more consistent scores across the tests (Fig.1b). The largest separation of scores between accepted dogs and those eliminated for behavior occurred at 6-months, at the Woodshop. That suggests this test and environment combination might best predict which dogs will be accepted into the training program. Among the traits that showed the greatest separation between the two outcomes were Physical and Independent Possession, and Confidence.

Three different classification machine learning algorithms, chosen for their ability to handle binary classification, were employed to predict acceptance: Logistic Regression, Support Vector Machines, and Random Forest. Data were split into training (70%) and testing (30%) datasets with equivalent ratios of success and behavioral elimination status as the parent dataset. Following training of the model, metrics were reported for the quality of the model as described in the Methods. Prediction of success for the Airport Terminal tests yielded consistently high accuracies between 70 and 87% (Table 2). The ability to predict successful dogs improved over time, with the best corresponding to 12-months based on F1 and AUC scores. Notably, this pattern occurred with an overall reduction in both the number of dogs and the ratio of successful to eliminated dogs (Supplemental Table 1). The top performance observed was for the Random Forest model at 12-months: accuracy of 87%, AUC of 0.68, and F1 (the harmonic mean of recall and precision) of 0.92 and 0.53 for accepted and eliminated dogs, respectively. The Logistic Regression model performed marginally worse at 12-months. Taking the mean of the four time points for accuracy, AUC, and accepted and eliminated F1, Logistic Regression was slightly better than Random Forest for the first three elements and vice versa for the fourth. The Support Vector Machines model had uneven results largely due to poor recall for eliminated dogs (0.09 vs. 0.32 and 0.36 for the other models).
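An illustrative sketch of this evaluation protocol is shown below: a stratified 70/30 split that preserves the accepted/eliminated ratio, the three model families named in the text, and per-class F1 plus AUC on the held-out set. The label coding, hyperparameters, and variable names are assumptions, not the authors' exact pipeline.

```python
# Sketch of the protocol described above. Assumed label coding:
# 1 = accepted, 0 = eliminated for behavioral reasons.
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

def evaluate(X, y):
    # 70/30 split keeping the class ratio of the parent dataset.
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.30, stratify=y, random_state=0
    )
    models = {
        "LogisticRegression": LogisticRegression(max_iter=1000),
        "SVM": SVC(probability=True),
        "RandomForest": RandomForestClassifier(n_estimators=500, random_state=0),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        pred = model.predict(X_te)
        prob = model.predict_proba(X_te)[:, 1]
        print(name,
              "F1 accepted:", round(f1_score(y_te, pred, pos_label=1), 2),
              "F1 eliminated:", round(f1_score(y_te, pred, pos_label=0), 2),
              "AUC:", round(roc_auc_score(y_te, prob), 2))
```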

Prediction of success from the Environmental tests yielded worse and more variable results (Table 2). A contributing factor may have been the smaller mean number of dogs with testing data compared to the Airport Terminal tests (56% vs. 73% of the cohort). Overall, the Logistic Regression model was most effective at predicting success based on F1 and AUC scores, and it showed a pattern of improving performance with advancing months. At 12-months, accuracy was 80%, the AUC was 0.60, and F1 scores were 0.88 and 0.36 for accepted and eliminated dogs, respectively. The best scores, seen at 12-months, coincided with the lowest proportion of dogs eliminated for behavioral reasons. Support Vector Machines had extremely low or zero F1 for eliminated dogs at all time points. All three models had their highest accuracy (0.82–0.84) and their highest or second highest F1 for accepted dogs (0.90–0.91) at 3-months. However, all three models performed poorly in predicting elimination at 3-months (F1 ≤ 0.10).

To maximize predictive performance, a forward sequential predictive analysis was employed with the combined data. This analysis combined data from both the Airport Terminal and Environmental tests at the 3-month time point and ran the three ML models, then added the 6-month time point, and so on. The analysis was designed to use all available data to determine the earliest time point at which a dog's success could be predicted (Table 3). Overall, the combined datasets did not perform much better than the individual datasets when considering their F1 and AUC values. The only instances where the combined datasets performed slightly better were M03 RF over the Environmental M03, M03+M06+M09 LR over both the Environmental and Airport Terminal M09, all-data SVM over Airport Terminal M12, and all-data LR over Environmental M12. For the remaining combinations, the F1 and AUC scores showed that the ML models were worse at distinguishing successful from eliminated dogs when the datasets were combined.
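
A rough sketch of how such a forward sequential analysis could be organized, assuming per-time-point feature tables keyed by dog ID (features_by_month, labels) and an evaluate_models helper standing in for the split/fit/score routine above; all names are hypothetical:

```python
# Start with the 3-month features, then cumulatively add each later time point
# and re-fit the three models on the growing feature set each time.
import pandas as pd

months = ["M03", "M06", "M09", "M12"]
cumulative = None
for m in months:
    block = features_by_month[m].add_prefix(f"{m}_")   # combined Airport + Environmental traits
    cumulative = block if cumulative is None else cumulative.join(block, how="inner")
    data = cumulative.join(labels)                      # `labels` holds the accepted/eliminated outcome
    print(f"Using data up to {m}: {cumulative.shape[1]} features, {len(data)} dogs")
    evaluate_models(data.drop(columns=["accepted"]), data["accepted"])
```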

Two feature selection methods were employed to identify the most important traits for predicting success at each time point: Principal Components Analysis (PCA) and Recursive Feature Elimination using Cross-Validation (RFECV). The PCA was performed on the trait data for each test, and no separation was readily apparent between accepted and eliminated dogs in the plot of Principal Components 1 and 2 (PC1/2). Scree plots were generated to show the percent variance explained by each PC, and heatmaps of the top 2 PCs were generated to visualize the contribution of the traits within those components. Within the heatmaps, the top- or bottom-most traits were those that explained the most variance within the respective component. RFECV was used with Random Forest classification for each test with 250 replicates, identifying at least one feature per replicate. In addition, 2500 replicates of a Naïve Bayes Classifier (NB) and Random Forest model (RF) were generated to identify instances where RF performed better than a naïve classification.
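
A hedged sketch of the two selection approaches, reusing the X and y placeholders from the earlier sketch; the replicate counts follow the text, everything else is illustrative:

```python
# PCA on standardized trait scores (for scree plots and PC1/PC2 loading heatmaps)
# plus RFECV wrapped around a Random Forest, repeated to count trait selections.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.feature_selection import RFECV
from sklearn.ensemble import RandomForestClassifier

X_std = StandardScaler().fit_transform(X)          # trait scores for one test/time point

pca = PCA()
pca.fit(X_std)
print("variance explained by PC1+PC2:",
      pca.explained_variance_ratio_[:2].sum())     # basis of the scree plots
loadings = pca.components_[:2]                     # trait loadings on PC1/PC2 (heatmap input)

selected_counts = np.zeros(X.shape[1])
for rep in range(250):                             # 250 RFECV replicates
    rfecv = RFECV(RandomForestClassifier(random_state=rep),
                  min_features_to_select=1, cv=5)
    rfecv.fit(X_std, y)
    selected_counts += rfecv.support_              # how often each trait is retained
```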

Scree plots of the Airport Terminal tests showed a steep drop at PC2, indicating that most of the trait variance is explained by PC1. The variance explained by the top two PCs ranged from 55.2 to 58.2%. The heatmaps (Fig.2a) showed that the PC1/2 vectors with the strongest effects were H1/2 at 3- and 6-months and PP at 9- and 12-months, both of which appeared in the upper left quadrant (i.e., negative in PC1 and positive in PC2). Several traits showed temporal effects within PCs: (i) at 3-months, PC1 had lower H1 than H2 scores, but that reversed and its effect increased at the other time points; (ii) at 3- and 6-months, PC2 had positive signal for H1/2, but both became negative at 9- and 12-months; (iii) at 3-months, HG was negative, but that effect was absent at other time points; (iv) at 3- and 6-months, PC2 had negative signal for PP, but it changed to strongly positive at 9- and 12-months. When RFECV was run on the same Airport Terminal test data, a similar pattern of an increasing number of selected traits with advancing time points was observed as in the PCA (Table 4). As in the PCA results, H2 was among the strongest traits at all time points except 6-months, although it first appeared among the replicates at 9-months. Means of the NB and RF models were compared (Supplemental Table 2) and showed that the M06 and M12 results were the most promising for classification. This suggests that shared traits such as the possession traits (MP, IP, and PP) and the second hunt test (H2) are the most important in identifying successful dogs during these tests; however, the distinct nature of the assessment at each time point does not allow for a longitudinal interpretation.

Principal Component Analysis (PCA) results for the Airport Terminal (a) and Environmental (b) tests. Each time point shows a heatmap displaying the relative amount of variance captured by each trait within the top 2 components.

The PCA results for the Environmental tests yielded scree plots with a sharp drop at PC2 for all time points except 9-months (Fig.2b). The amount of variation explained by the top two components decreased with increasing time point, from 62.7 to 49.8%. The heatmaps showed that the PC1/2 vector with the strongest effect was the toy possession trait IP, which appeared in the upper left quadrant at all time points (CR and PP had a similar effect at reduced magnitudes). Within-PC observations included the following: (i) in PC1, Confidence and Initiative were negative at all time points, and (ii) in PC2, Concentration and Excitability were positive at 3-months and increased at 6-, 9-, and 12-months. When RFECV was run on the Environmental test scores (Table 4), all traits were represented in the results at both 9- and 12-months. At 3-months, only Confidence and Initiative were represented, and at 6-months, only those and Responsiveness. Means of the NB and RF models were also compared (Supplemental Table 2) and showed that M03 and M12 were the most significant for classification. These tests correspond to the earliest test, at the gift shop, and the last test, at an active airport terminal. Primary shared traits include Confidence and Initiative, with possession-related and concentration traits being most important at the latest time point.


Predictive Analytics And Machine Learning Market: A … – Fagen wasanni

The Predictive Analytics And Machine Learning Market is the subject of a new study by MarketsandResearch.biz. The report presents a detailed analysis of the market, including crucial determinants such as product portfolio and application description. It incorporates trends, restraints, drivers, and different opportunities that shape the market's future status.

The report provides vital details about the market flow and its forecasted performance during the period of 2022-2028. It also examines the study of raw materials, downstream demand, and present market dynamics. The report profiles frontline players in the Predictive Analytics And Machine Learning market, with a focus on their technology, application, and product types.

Key market features are highlighted in the report to provide a comprehensive view of the global market. These include revenue size, average regional price, capacity utilization rate, production rate, gross margins, consumption, import & export, demand & supply, cost bench-marking, market share, annualized growth rate, and periodic CAGR. The report also covers supply chain facets, economic factors, financial data particulars, and analysis of various acquisitions & mergers, as well as present and future growth opportunities and trends.

Major companies operating in the global market are profiled in the report, including Schneider Electric, SAS Institute Inc., MakinaRocks Co., Ltd., Globe Telecom, Inc., Qlik, RapidMiner, IBM, Alteryx, Alibaba Group, Huawei, Baidu, and 4Paradigm.

Market segmentation based on product type includes General AI and Decision AI. By end-users/application, the report covers segments such as Financial, Retail, Manufacture, Medical Treatment, Energy, and Internet.

The report further provides regional segmentation, focusing on current and projected demand for the market in North America, Europe, Asia-Pacific, South America, and the Middle East & Africa.

The report aims to estimate the market size for the global Predictive Analytics And Machine Learning market on a regional and global basis. It also identifies major segments in the market and evaluates their market shares and demand. The report includes a competitive scenario and major developments observed by key companies in the historic years.

In conclusion, the report provides comprehensive analysis for new entrants and existing competitors in the industry. It also delivers a detailed analysis of each application/product segment in the global market.

For customized reports that meet specific needs, clients can contact the sales team at MarketsandResearch.biz.


Photonic Neural Networks: Revolutionizing Machine Learning and AI – Fagen wasanni

Researchers at Politecnico di Milano have made a significant breakthrough in the field of photonic neural networks. These networks, inspired by the human brain, have the potential to revolutionize machine learning and artificial intelligence systems.

Neural networks analyze data and learn from past experiences, but they are energy-intensive and costly to train. To overcome this obstacle, the researchers have developed photonic circuits that are highly energy-efficient. These circuits can be used to build photonic neural networks that utilize light to perform calculations quickly and efficiently.

The team at Politecnico di Milano has developed training strategies for these photonic neurons, similar to those used for conventional neural networks. This means that the photonic neural network can learn quickly and achieve precision comparable to traditional neural networks, but with considerable energy savings. The energy consumption of these networks grows much more slowly compared to traditional ones.

The researchers have created a photonic accelerator in the chip that allows calculations to be carried out very quickly and efficiently. Using a programmable grid of silicon interferometers, the calculation time is incredibly fast, equal to less than a billionth of a second.

The implications of this breakthrough extend beyond machine learning and AI. The photonic neural network can be used for a wide range of applications, such as graphics accelerators, mathematical coprocessors, data mining, cryptography, and even quantum computers. The technology has the potential to revolutionize these fields by providing high computational efficiency.

The energy efficiency, speed, and accuracy of photonic neural networks make them a powerful tool for industries seeking digital transformation and AI integrations. With this technology, businesses can approach machine learning and artificial intelligence in a more cost-effective and efficient manner. The future holds great potential for photonic neural networks to shape the development of artificial intelligence and quantum applications.


Growing Concerns Over Bias in Powerful AI and Machine Learning … – Fagen wasanni

The rise of powerful artificial intelligence (AI) and machine learning (ML) tools has sparked concern about the presence of bias in these technologies. Sam Altman, CEO of OpenAI, acknowledges that there will never be a universally unbiased version of AI. As these tools become more prevalent across industries, bias has become a critical topic for lawmakers. Some countries, like France, have even banned the use of AI tools in certain sectors to prevent the commercialization of tools that predict judicial decision-making patterns.

One major concern with AI tools is the potential for biases to undermine the neutrality of the legal system. The use of predictive analysis tools that process vast amounts of data can produce unsettlingly accurate results. This raises questions about justice when an AI tool predicts guilt or innocence based on the judge or magistrate handling the case, rendering individual guilt irrelevant.

The issue of bias extends beyond the legal system. Industries such as healthcare and finance are increasingly embracing AI technology. Pfizer, for example, experimented with IBM Watson to accelerate drug discovery efforts in oncology. While IBM Watson fell short of expectations, the emergence of more powerful AI tools has renewed excitement in the industry. However, biases introduced during the data collection and algorithm development processes can lead to inequitable outcomes in patient treatment or financial decision-making.

Biases can enter datasets through factors like sampling bias, confirmation bias, and historical bias. To address bias, Altman highlights the importance of representative and diverse datasets. The quality of data directly affects the potential for bias in AI models.

The responsibility for addressing bias falls on policymakers as AI continues to impact society and individual lives. The proliferation of AI systems holds a mirror to society, revealing uncomfortable truths that might necessitate ethical guidelines and frameworks to ensure fairness and accountability in the use of AI technology.


Portrait of intense communications within microfluidic neural … – Nature.com

Construction of in vitro neural networks (NN)

The topology for the microfluidic NNs was designed as a dual-compartment architecture separated by microchannels and a middle chamber, as described in Fig.1a and b. The microfluidic design includes large channels (teal area) on both sides of the microfluidic circuit, which are for seeding somas. Physical barriers prevent the somas from migrating outside these large chambers. However, the 5-µm-tall microchannels and a middle chamber (red area) enable neurites to spread and connect the fluidic compartments along defined pathways. Because of the enhanced growth kinetics of axons, long, straight microchannels (>500 µm in length) are expected to favor them and to prevent dendrites from connecting distant populations.

Figure1c illustrates the possible neurite guidance and connection schemes. From left to right, the first and shortest microchannels should favor neurite outgrowth from the somatic to the synaptic chamber. From there, dendrites are expected to spread over this 3-mm-wide middle chamber, while the axons, in contrast, may grow straight ahead toward the opposite channels or turn back toward the somatic chamber. At one entrance of the long axon microchannels, short dead-end microchannels should prevent an axonal closed loop, which would lock axons into the long microchannel. Those traps should guide the axons toward the short microchannel and the somatic chamber. The last schematic illustrates a simple, inexhaustive list of examples of connectivity that may result from these guiding rules in the cases of one or two nodes located in a somatic chamber. Active and hidden nodes (blue and gray circles, respectively) can both be involved.

The microfluidic circuits are then assembled with electronic chips on which microelectrode arrays are accurately aligned with the fluidic compartments and microchannels (Fig.2). Thus, several recording devices can efficiently track spike propagation within the neurites while simultaneously monitoring soma activation.

Optical and fluorescent micrographs of random and microfluidic networks showing the homogeneous distribution of somas within the random area of both control (a) and microfluidic (b,c) samples and the wide exploration of neurites within all fluidic compartments, including the somatic chamber (c), the microchannels and the synaptic chamber (d–f). Immunofluorescence staining was performed after 14 days in culture. DAPI, anti-synapsin, and anti-tubulin (YL1/2) were chosen as markers for labeling the cell nuclei, synapses and cytoskeleton, respectively.

For both growth conditions, primary cells extracted from hippocampal neurons were seeded on poly-l-lysine-coated microelectrode arrays and cultured in glial-conditioned media (same culture for both conditions). Thus, the substrate properties and culture conditions remained the same for the two batches of samples (details in Materials and methods). In the somatic chamber, neurons were well dispersed, and neurites homogeneously covered the underlying substrate surface, forming a highly entangled mesh (Fig.2b). Additionally, the synaptic chamber was widely explored by the neurites (Fig.2d), confirming their efficient spreading within the short microchannels as well as the efficient filtering of somas (Fig.2e). Figure2f gives a closer view of the junction with the synaptic chamber. The intricate entanglement of neurites and their proximity within the microchannels are expected to reinforce the neurite coupling efficiency and the network's modularity. These first results confirmed the healthy and efficient outgrowth of neurons in the microfluidic compartments, which succeeded in providing the expected network structure, mainly by keeping the soma and neurite compartments in the desired locations.

Figure3 shows representative activity recorded within the random and organized networks on Day 6 in vitro (DIV6). As clearly observed, the number of active electrodes and the spike rate are significantly higher in the organized microfluidic NN (Fig.3a and b). Additionally, the number of isolated spikes, as opposed to burst events, was higher than in controls (Fig.3c). Thus, modularity appears enhanced within the microfluidic NN (dual-compartment configuration shown in Fig. S1).

Activity patterns of random and organized NNs. Comparison of the neuronal activity of hippocampal neurons cultured in a random configuration (left column) and on a microfluidic chip (right column). Recordings were acquired 6 days after seeding (6 days in vitro). (a) Typical 50 s time course of one recording channel of the MEA within the control random sample (left) and inside an axonal microchannel (right). (b) Raster plots of all events crossing the negative threshold of 5 mean absolute deviations for the 64 recording channels of the MEA in the control and microfluidic conditions (left and right, respectively). Red dots highlight examples of collective bursts. (c) Evolution of neural activity over the culture time for random (blue) and organized (red) NNs, in terms of the following (from left to right): mean spike rate per active electrode (minimum 0.1 Hz mean firing rate), number of active electrodes, mean burst rate and burst duration. The mean spike and burst rates are extracted from the voltage traces for each recording channel and averaged among all active electrodes (60 electrodes total, same culture for all conditions). Statistical significance ***p < 0.001 (Student's t test).
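
For readers who want the mechanics, a minimal sketch of this kind of threshold-based event detection, assuming the raw voltage traces are available as a (channels × samples) array; the threshold convention follows the caption, everything else is illustrative:

```python
# Count spikes where each voltage trace crosses a negative threshold of
# 5 mean absolute deviations, per recording channel.
import numpy as np

def detect_spikes(traces, fs, k=5.0):
    spike_times = []
    for trace in traces:                                   # one trace per electrode
        mad = np.mean(np.abs(trace - np.mean(trace)))      # mean absolute deviation
        threshold = -k * mad
        crossings = np.where((trace[1:] < threshold) & (trace[:-1] >= threshold))[0]
        spike_times.append(crossings / fs)                 # spike times in seconds
    return spike_times

# Mean spike rate per active electrode (>= 0.1 Hz), as reported in Fig. 3c.
def mean_rate(spike_times, duration_s, min_rate=0.1):
    rates = np.array([len(t) / duration_s for t in spike_times])
    return rates[rates >= min_rate].mean()
```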

Note that the electrodes located within the microchannels are expected to have a high sealing resistance because the channel cross section is small and filled with cellular material. As a result, the detection efficiency of such electrodes is believed to be increased compared to that of their synaptic and somatic chamber counterparts44. This effect, related only to the measurement conditions, could artificially increase the activity level observed in microfluidic NNs. However, the spiking rate measured in the synaptic chamber did not follow that trend. While this compartment was similar to the somatic chamber in terms of growth conditions, the spiking rate was significantly higher, being rather comparable to that of the microchannels. Thus, the recording conditions cannot explain the higher electrical activity. The electrical activity was enhanced independently of the MEA's detection efficiency, revealing the impact of the NN structure on cell activity and the discrepancy between the spiking dynamics of somas and neurites.

The mid-term evolution of the electrical activity remained the same for both conditions, with all electrophysiological features globally increasing over time up to Day 15 (Fig.3c). Interestingly, the maximal number of active electrodes was reached earlier for the confined microfluidic NN (i.e. 4 days earlier than for the open NN, Fig. S2). Additionally, the number of active electrodes was significantly higher, in agreement with the raster plots (Fig.3b). Thus, more electrodes were active, and their activation occurred earlier in cell development. The confinement and geometrical constraints of the microfluidic environment reinforce the establishment of electrical activity, which agrees with the accelerated maturation of neuronal cells previously observed by immunohistochemistry within a similar microfluidic chip24.

The evolution of the burst rate followed a similar trend, increasing up to Day 14. Values ranged from 24 to 34Hz for the microfluidic networks, greatly exceeding the bursting rate of random NN (10 times higher). The burst duration was, however, similar for control and microfluidic networks, slightly increasing with the culture time (from 50 to 250ms) and as expected for hippocampal neurons4, confirming the reliability of the microfluidic NNs.

Neurite compartments exhibited dense activity patterns compared to the somatic chamber, with the highest spiking rates being located within the proximal compartments that were the closest to the somatic chamber (Fig.4). Within these short microchannels, spike patterns were characterized by the highest spike amplitude and shape variability. This variability remained within the synaptic chamber, but spike amplitudes were lowered. In those short and synaptic compartments, both dendrites and axons can be expected. However, in the distal and long microchannels, spike amplitude and shape were almost perfectly constant, which is as expected for action potentials carried by axons. These discrepancies were observed under the same growth conditions, all within the microchannels, and stem from the physiological properties of neurites.

Spike forms acquired in each microfluidic compartment. Data are sourced from the same recording at DIV 11, with the 50s time trace on the left and the superposed cutouts extracted by a spike sorting algorithm (detailed in methods). From top to bottom, the figure shows the typical voltage time trace and spike forms within the long and distant axonal microchannel; the synaptic middle chamber (without somas); the short neurite (dendrites and axons) microchannels; and the somatic chamber.

Interestingly, the activity in the somatic chamber resembled that of the control samples in terms of spike shape and spike rate (Fig.3a). When the activity within the somatic chamber was isolated, the spiking rate closely followed the trend observed in control samples, ranging from 0.9 to 2.5 Hz from 6 to 11 days (Fig. S2), which is a typical value for hippocampal neurons. Thus, the areas containing the somas (within the random and organized NNs, respectively) exhibited comparable spike patterns regardless of the growth condition (open or confined). Previous works reported similar differences between somatic and axonal spikes (without the microfluidic environment)42, which agrees with our observations and further highlights the physiological relevance of these results. Here, the microchannels provided a unique way to identify and study neurite activity in proximal and distant areas, presumably corresponding to dendrites and axons, respectively.

The cross-correlation (CC) analysis (Fig.5) provided a functional cartography of the random and organized networks at several stages of their development (detailed in Materials and methods; see Fig. S3 for the dual-somatic chamber). For the control sample, correlations became significant at DIV11 between electrode clusters randomly dispersed over the whole sample (Fig.5a). Their amplitude was weak but remained constant over the network. In contrast, cross-correlations were spatially defined and more intense in terms of amplitude and number within the organized networks (Fig.5b), also emerging earlier, at DIV5.

Correlations within random and organized NNs. The cross-correlation matrix (CCM) was extracted from the 60 recording channels of the MEAs during the culture time (one electrode per line and per column; bin size < 5 ms). From top to bottom: CCM obtained at DIV11 and DIV14 for the control sample (left) and at DIV6 and DIV11 for the microfluidic sample (right). (Bottom right) Schematics illustrate the position of the recording channels within the microfluidic compartments. The bottom colored bar is then used on the (x,y) axes of the CC maps to highlight the position of each microelectrode: (filled, teal) in the large chamber containing all the somas, (filled, red) in the microchannels and the synaptic chamber, and (open, teal) in the empty large chamber for axon outputs only (no somas).
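
A small sketch of how such a CCM can be computed from binned spike trains (bin size < 5 ms, zero lag), reusing the spike_times list from the detection sketch above; this is an illustration, not the authors' analysis code:

```python
# Bin each electrode's spike times and take the pairwise Pearson correlation.
import numpy as np

def cross_correlation_matrix(spike_times, duration_s, bin_s=0.005):
    edges = np.arange(0.0, duration_s + bin_s, bin_s)
    binned = np.array([np.histogram(t, bins=edges)[0] for t in spike_times])
    return np.corrcoef(binned)        # e.g., a 60 x 60 matrix for a 60-electrode MEA

ccm = cross_correlation_matrix(spike_times, duration_s=300.0)  # duration is illustrative
```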

Maximal values were found within the long and distal microchannels, with mean correlation coefficients close to 1 and 0.5, respectively. Indeed, strong correlations can be expected when measuring spike propagation within the axonal compartment, and this effect is most pronounced within the distal, long microchannels.

Somatic signals were correlated with some electrodes located in the microchannels and the synaptic chamber, revealing long-range synchrony as well (Fig.5b). Their amplitudes increased with time (Fig.5d), revealing a reinforcement of network synchrony and connectivity, especially between the microchannels and the synaptic and somatic chambers. This was concomitant with a modulation of short-range correlations, which became stronger between neighboring electrodes. This effect could have several origins, such as selection over time of the master node and a reinforcement of selected connections. Additionally, it could result from inhibitory activity: glutamatergic and GABAergic neurons are expected in similar proportions in our culture, and their maturation could explain the appearance of silent electrodes at the final stage of electrical maturation.

Thus, groups of spatially confined electrodes revealed a synchronization of the subpopulation consistent with the geometrical constraints. Somatic and synaptic chambers and neurite microchannels exhibited specific spiking patterns (Figs.3 and 4) and correlation landscapes (Fig.5) that enabled the identification of each network compartment. In that way, microfluidic circuits are capable of inducing significant differences in the spatiotemporal dynamics of in vitro neural networks.

The short-term cross-correlations between each pair of microelectrodes were then assessed to track signal propagation between compartments (Fig.6). Figure6a first assesses the connectivity of the somatic chamber. The main feature was that correlation and synchrony levels were higher between somas and neurites than between somas. Most of the correlations occurred with the proximal microchannels, which explains the synchrony and correlation between proximal neurites (Fig.6b, purple column). The analysis also reveals long-range correlations with both the synaptic chamber and the axonal microchannels (orange and yellow columns). Thus, somatic signals efficiently triggered the emission of spikes within distant axonal microchannels (up to a few mm away).

Immediate correlation of spike trains within the organized NN. Mapping of short-term correlations (signal delay ≤ 2.5 ms) extracted from the MEA recordings of 11-day-old microfluidic NNs. Arrows represent a significant correlation between the 5-ms-binned spike trains of two electrodes. The maximum delay between correlated electrodes is 2.5 ms. The four panels (a–d) distinguish the interactions between (a) somas and neurites (blue arrows) and (b–d) along neurites. (b) Correlation between electrodes of the same MEA column but within different microchannels (purple arrows), showing backward and forward propagation between adjacent neurite channels or synchrony between proximal neurites resulting from the same excitation. (c) Correlation between electrodes of the same MEA line, thus within the same or aligned microchannels (green arrows), showing straight spike propagation. (d) Correlation between each electrode located within the microchannels and the synaptic chamber (red arrows), showing entangled neurite-neurite interactions. Straight correlations (green arrows, in panel (c)) are excluded.

Between different microchannels (Fig.6b, purple arrows), the correlations appeared strongest in the synaptic chamber (n = 3.9 per electrode, orange column), where there is no physical barrier to restrict communication between neurites. The correlations between different microchannels (purple, yellow, and red columns) could reveal backward and forward propagation between adjacent neurite channels or synchrony resulting from the same excitation; these could stem, respectively, from closed loops of neurites (Fig.1c) or from the proximity between the microchannels and the somatic or synaptic chambers. These correlations were stronger for proximal microchannels, both in number and in correlation length, up to electrodes separated by 5 pitches (n+5). If we consider the neural architecture as designed, this would suggest a higher level of connectivity for the dendrites and proximal axons (both present within the short microchannels) than for the distant axons (long microchannels). Further studies should assess this point with immunostaining to identify dendrites and axons and excitatory and inhibitory neurons, for instance. In fact, we must not neglect other possibilities, such as the impact of dendritic signals (e.g., EPSPs and IPSPs from excitatory and inhibitory neurons, respectively), which may hide activity within distant microchannels.

Figure6c shows straight propagation along aligned microchannels (green arrow) and presumably along the same or connected neurites. Again, more signals propagated to the left than to the right side of the synaptic chamber, which agrees with the expected position of the dendrites and axons and the filtering effect of the synaptic chamber. These propagations were dominated by short-distance correlations, essentially between neighboring electrodes (n+1 or n+2). Long-range interactions were, however, clearly distinguished between misaligned electrodes (Fig.6d, red arrow), with each active site being correlated on average with three distant (>n+1) electrodes and one neighboring (n+1) electrode. The spatial range of the correlation reached several millimeters (up to n+6). Generally, those panels show that straight propagation involved axonal channels, while propagation between dendrites and within the synaptic chamber was more spatially distributed, which is indeed as expected for hippocampal neurons. The design architecture of the microfluidic NN is functionally relevant.

The directionality of neural communications was then assessed by examining the delayed cross-correlations (delays between 5 and 25 ms). Correlated spike trains were thus expected to share a similar origin. We assume that a positive delay between correlated electrodes (A and B) indicates the direction of propagation (from A to B), regardless of the propagation pathway (which may be indirect, through hidden nodes). Under this assumption, most of the short-range correlations observed previously were suppressed, while long-range correlations remained numerous despite the distance between electrodes and the background noise (Fig.7).
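
A toy sketch of this delay-based reading of directionality for one electrode pair, assuming the binned spike trains from the earlier sketches; the positive-lag-means-A-leads-B convention mirrors the assumption stated above:

```python
# For two binned spike trains, find the lag (in 5 ms bins) that maximizes their
# correlation; lags of 1 to 5 bins correspond to the 5-25 ms delays of Fig. 7.
import numpy as np

def best_lag(binned_a, binned_b, max_lag_bins=5):
    lags = list(range(-max_lag_bins, max_lag_bins + 1))
    corrs = []
    for lag in lags:
        if lag >= 0:
            a = binned_a[:len(binned_a) - lag] if lag else binned_a
            b = binned_b[lag:]
        else:
            a = binned_a[-lag:]
            b = binned_b[:len(binned_b) + lag]
        corrs.append(np.corrcoef(a, b)[0, 1])
    return lags[int(np.argmax(corrs))]   # a positive lag is read as A leading B
```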

Long-term correlation of spike trains within the organized NN. Mapping of delayed correlations (signal delay ≤ 25 ms) extracted from the MEA recordings of 11-day-old microfluidic NNs. Arrows represent a significant correlation with a delay between −25 ms and 25 ms between the 5-ms-binned spike trains of two electrodes. Short-term correlations with a delay of less than 5 ms are excluded. The four panels (a–d) distinguish the interactions between (a) somas and neurites (blue arrows) and (b–d) along neurites. The same representation as in Fig.6 is used for the purple, green and red arrows.

The temporality of events was clear within aligned microchannels (Fig.7c). Signals propagated from the short to the long microchannels toward the axons and seemed to originate from the somatic chamber (Fig.7a). Additionally, the same somatic electrode seemed to activate several neurite channels, which could explain the correlation observed between those microchannels (Fig.7b). Within adjacent and parallel microchannels (Fig.7b), signals could be carried by the same neurites (in a closed-loop configuration), but the delay (5–25 ms) suggests indirect communications, presumably through dendrites. As illustrated in Fig.7d, communications were highly intricate between short and long channels, which confirms efficient neurite mixing within the synaptic chamber. The directionality was also mixed, as 50% of propagations occurred in both directions for the purple and red columns (short and long microchannels). This dual directionality agrees with the emergence of both input and output nodes in the same somatic chamber (green and blue columns, Fig.7a). For that reason, we can barely distinguish backpropagation events, if any, and their impact on signal processing within such microfluidic circuits.

Interestingly, we observed only one efferent node and a few (3–4) afferent nodes (output and input nodes, respectively) for both conditions, within the organized and random NNs (Fig.7a and Fig. S4, respectively). However, the number of correlated spike trains was significantly reduced in control cultures of the same age, which confirms the intense activity underlying the accelerated maturation within the microfluidic environments. The microchannels are shown to enhance the detection efficiency and amplitude of recorded signals. However, high levels of activity and synchrony were also observed in the wider synaptic chamber, which excludes an isolated effect of the enhanced detection efficiency within the microchannels. Differences in encoding properties between random and organized NNs are thus demonstrated, leveraging a high level of connectivity. While somas and neurites could be isolated, this analysis underlines the complexity of neural communications and the rich encoding possibilities even within a basic one-node architecture.


New Optical Neural Network Filters Info before Processing – RTInsights

The system is similar to how human vision works by discarding irrelevant or redundant information, allowing the ONN to quickly sort out important information.

Cornell University researchers have developed an optical neural network (ONN) that can significantly reduce the size and processing time of image sensors. By filtering out irrelevant information before a camera detects the visual image, the ONN pre-processor can achieve compression ratios of up to 800-to-1, equivalent to compressing a 1,600-pixel input to just four pixels. This is one step closer to replicating the efficiency of human sight.

The ONN works by processing light through a series of matrix-vector multiplications to compress data to the minimum size needed. The system is similar to how human vision works by discarding irrelevant or redundant information, allowing the ONN to quickly sort out important information, yielding a compressed representation of the original data. The ONN also offers potential energy savings over traditional digital systems, which save images and then send them to a digital electronic processor that extracts information.
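
As a rough digital analogue (not the optical hardware itself), the compression step amounts to a learned matrix-vector multiplication; the random weights below merely stand in for the trained interferometer settings:

```python
# Map a flattened 1,600-pixel frame (40x40) to a 4-value compressed readout.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 1600)) / np.sqrt(1600)   # encoder matrix (4 outputs)
image = rng.random(1600)                          # flattened 40x40 input frame

code = W @ image                                  # the compressed representation
print(code.shape)                                 # (4,)
```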

The researchers tested the optical neural network image sensor with machine-vision benchmarks, used it to classify cell images in flow cytometers, and demonstrated its ability to measure and identify objects in 3D scenes. They also tested reconstructing the original image using the data generated by ONN encoders that were trained only to classify the image. Although not perfect, this was an exciting result, as it suggests that with better training and improved models, the ONN could yield more accurate results.

Their work was presented in a paper titled "Image Sensing with Multilayer, Nonlinear Optical Neural Networks," published in Nature Photonics.

See also: Using Photonic Neurons to Improve Neural Networks

ONNs have potential in situations where low-power sensing or computing is needed, such as in image sensing on satellites, where devices that use very little power are required. In such scenarios, the ability of ONNs to compress spatial information can be combined with the ability of event cameras to compress temporal information, as the latter is only triggered when the input signal changes.


Tuning and Optimizing Your Neural Network | by Aye Kbra … – DataDrivenInvestor

A guide to tuning and optimizing neural networks.

Table of Contents
1. Understanding the Basics of Neural Networks
2. Importance of Tuning and Optimizing Neural Networks
3. Training, Validation, and Test Sets: An Overview
4. Hyperparameters and Their Role in Neural Networks
5. Tuning Neural Network Hyperparameters
6. Strategies for Efficient Hyperparameter Tuning
7. Regularization Techniques for Avoiding Overfitting
8. Optimizing Neural Networks with Backpropagation
9. Advanced Optimization Techniques
10. Utilizing Hardware for Network Optimization
11. Debugging Neural Networks
12. Staying Current with Neural Network Optimization Trends

The first step in tuning and optimizing your neural network is to understand the basic principles of how neural networks work. In this section, we'll delve into the foundational concepts, including neurons, layers, and weights.

A neuron in a neural network is a mathematical function that collects and classifies information according to a specific architecture. The neuron takes in inputs, multiplies each by its respective weight, sums the results, and passes the sum through an activation function to produce an output.

The layers of a neural network consist of an input layer, hidden layers, and an output layer. The input layer receives raw input while the output layer makes final decisions or predictions. Hidden layers fine-tune the input data.

Weights are the crux of the neural network as they adjust during the training process, helping your network learn from the errors it makes.
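
A toy illustration of these pieces in code, with made-up sizes and random weights:

```python
# A neuron computes a weighted sum of its inputs plus a bias and passes it
# through an activation function; stacking such neurons gives the input,
# hidden, and output layers described above.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def dense_layer(inputs, weights, biases, activation=relu):
    return activation(weights @ inputs + biases)

rng = np.random.default_rng(1)
x = rng.random(3)                                                # raw input (3 features)
h = dense_layer(x, rng.normal(size=(4, 3)), np.zeros(4))         # hidden layer of 4 neurons
y = dense_layer(h, rng.normal(size=(1, 4)), np.zeros(1),
                activation=lambda z: 1 / (1 + np.exp(-z)))       # sigmoid output
print(y)
```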


Simulation analysis of visual perception model based on pulse … – Nature.com

Neural network dynamics

In a PCNN, each pulse element receives external stimulus input through two channels: a feedback input channel and a linking (connection) input channel. The internal activity item U of the pulse element is obtained by nonlinear multiplicative modulation of the feedback input item F by the linking input item; U stands for the nonlinear modulation matrix. Whether a pulse is emitted in the PCNN depends on the internal activity item U and the threshold E of the neuron. Each pulse coupling kernel has a fixed size; the six pulse coupling kernels in layer C1 are 5×5. The function f represents the pixel value of the coupled pulse image. The pulse coupling kernel slides over the input data f(i, j) with a fixed step size u(i), so that the kernel computes the pulse coupling on the local data f(i).

$$ \frac{1}{1 - n}\sum \frac{f(i,j) - u(i)}{f(m) - f(n)} < n $$

(1)

$$ 1 - |x| > \frac{1}{1 - n}\ln |x - f(j - 1)| $$

(2)

In the process of sparse decomposition 1 − |x|, the high-frequency coefficients of the multi-scale decomposition represent detailed information such as the region boundaries and edges of the multi-source image, and the human visual system is sensitive to such edge detail. How to construct a high-frequency-coefficient perception strategy and extract the significant high-frequency coefficients is therefore very important for improving the quality of the perceived image. Combining the characteristics of the high-frequency components of the source image w(s, t), the image quality evaluation factor p(x, y) is used to construct the perception strategy.

$$ w(s,t) - w(s,0) = w(s - 1,t - 1) $$

(3)

$$ \sum \{ p(x,y) - p \} (x - x^{2}) < p[n - 1] $$

(4)

In a PCNN, each pixel in the image corresponds to one pulse element. When an element fires, the threshold E increases rapidly through the feedback input, causing the pulse element to stop emitting pulses. The threshold k(x)/k(y) then begins to decay over time, and when it again drops below the internal activity term, the pulse element fires once more, and so on; a minimal code sketch of this update loop is given after Eq. (5).

$$ \sum k(x) / k(y) < \log (x - x^{2} - y - 1) $$

(5)
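
A minimal sketch of the standard PCNN update loop described above (feedback input F, linking input L, internal activity U = F(1 + βL), and a dynamic threshold E that jumps after firing and then decays); the parameter values and 3×3 coupling kernel are illustrative only, not the paper's settings:

```python
# Each pixel is a pulse element; neighbouring firings are coupled in through a
# small kernel, and a neuron fires whenever its internal activity exceeds its
# dynamic threshold, which then jumps and slowly decays.
import numpy as np
from scipy.ndimage import convolve

def pcnn(image, steps=20, beta=0.2, aF=0.1, aL=0.3, aE=0.4, VF=0.1, VL=0.2, VE=20.0):
    kernel = np.array([[0.5, 1.0, 0.5],
                       [1.0, 0.0, 1.0],
                       [0.5, 1.0, 0.5]])            # local coupling weights
    F = np.zeros_like(image, dtype=float)
    L = np.zeros_like(F)
    E = np.ones_like(F)
    Y = np.zeros_like(F)
    firing_maps = []
    for _ in range(steps):
        coupled = convolve(Y, kernel, mode="constant")
        F = np.exp(-aF) * F + VF * coupled + image   # feedback channel (receives the stimulus)
        L = np.exp(-aL) * L + VL * coupled           # linking channel (neighbour pulses)
        U = F * (1.0 + beta * L)                     # nonlinear modulation
        Y = (U > E).astype(float)                    # fire where activity exceeds threshold
        E = np.exp(-aE) * E + VE * Y                 # threshold decays, then jumps after firing
        firing_maps.append(Y.copy())
    return firing_maps
```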

The algorithm first performs variance-based enhancement on color images, then uses the pulse-coupled neural network with spatial-adjacency and similar-brightness feature clustering, locates the noise points by comparing the differences between the firing times of different image pixels, and finally applies rules similar to those of the vector median filtering algorithm. Since each pixel computes its similarity to multiple seed points, the seed point most similar to the pixel, that is, the one at the minimum distance, is taken as the clustering center, and the pixel is then assigned the number of that seed point. Finally, the color values and coordinate values of the seed point and all of its pixels are summed and averaged to obtain the new cluster center (Fig.1).

Neural network clustering sample fusion.

The registered right- and left-focus samples were fused. Effective fusion should produce an image that is clear on both the left and right, that is, it should restore the contrast and sharpness of the respective blurred areas in the two images. In order to make the result as consistent as possible with the physical standard image, we choose the correlation coefficient between the perceptual result and the physical standard image as one of the measurement indexes. In addition, the image definition measured by the average gradient, the image scale measured by the standard deviation, and the image information content measured by the entropy are also discussed. When the pulse coupling kernel slides over the entire input data, only local data are extracted each time for feature calculation, which reflects the local connectivity of the PCNN and greatly speeds up the calculation. During sliding, the parameters of each pulse coupling kernel remain unchanged, which means that each kernel only observes the features it is meant to extract through its own parameters; this greatly reduces the number of parameters and reflects the parameter-sharing property of the PCNN.

Based on the chaotic sequence and cyclic/block-diagonal splitting structure of homomorphic filtering, and aiming at the problems of poor reconstruction performance and high computational complexity, this paper proposes a deterministic measurement matrix optimization strategy based on modified gradient descent to minimize the correlation between the observation matrix and the projection matrix. If the condition holds, the point (x, y) belongs to the foreground; otherwise, it belongs to the background. Compared with single-threshold segmentation miu(r, g, b), double-threshold segmentation can effectively reduce misjudgment.

$$ miu(r,g,b) = \sqrt{ \big( miu.\exp(r,g) - miu.\log(r,b) \big) - 1 } $$

(6)

$$ \log (i + j) - \log (i - j) - 1 < i - j $$

(7)

Since the point cloud data log(i+j) has no explicit connectivity, the bilateral filtering algorithm cannot be applied directly to point-cloud surface denoising. The bilateral filtering algorithm mainly involves a point V. In this paper, the method is used to compute the neighboring points of the discrete point V, and the vertex normal is obtained by optimizing a quadratic energy term over those neighbors. The essence of visual perception here is that the image is divided into several regions according to some similarity principle, so the quality of the segmented images can be judged by the uniformity within each region. Therefore, the optimal segmentation result can be identified by calculating the 1/(1 − i) value of the binary image, so as to realize the automatic selection of the optimal segmentation result exp(1/d).

$$ \frac{1 - i}{i}Z(i - j - k) = \frac{1}{1 - i} + \frac{1}{1 - j} + \frac{1}{1 - k} + 1 $$

(8)

$$ \exp \left( - \frac{miu(x + y - 1)}{2d} \right) \Big/ \exp \left( - \frac{x + y}{d} \right) < 1 $$

(9)

Coupling connection miu(x+y−1)/d refers to the operating mode of the PCNN when the connection strength coefficient is not equal to 0. In this mode, an element not only receives external excitation but also receives feedback input information from neighboring pulse elements, so that all pulse elements in the model are coupled to one another. With coupling connections, using the coupling (linking) input L to regulate the feedback input F is the key to communication between pulse elements in the coupled PCNN model.

$$ \sum |x + p(x - 1)| \; \sum |x - p(x - 1)| \in w(x,t) $$

(10)

In the clipping method, the boundary p(x−1) of one mesh is used to cut the other mesh in the overlapping area w(x, t), and new triangles are then generated on the common boundary to join the two meshes together. This method produces a large number of small triangles at the common boundary due to clipping. Moreover, it only uses the vertices of one mesh in the overlapping region, while the vertices of the other mesh are discarded entirely; for meshes with a large overlapping region, the overlap cannot be used to correct the vertices. At the same time, because of errors in the registration of multi-slice meshes, the boundary of one mesh needs to be projected onto the other mesh before clipping (Fig.2).

Homomorphic filtering results of visual images.

Since the image fusion rules determine the final perception result, it is preferable to choose fusion rules that better match the perception expectations when designing the image perception experiment. After pyramid decomposition, an image yields a low-frequency subimage carrying the approximate information of the feature image and high-frequency subimages carrying its detail features. Therefore, designing different perception rules for different features can better achieve high-quality image perception. For the same experimental image, if the entropy of the segmented image obtained by a certain method is relatively large, the performance of that segmentation method is better. In general, the segmentation effect of the proposed method is better than that of the other segmentation methods. Whether judged by objective evaluation criteria or by direct observation of the segmentation results, the preservation of color edge details in the central area is better than with other methods.

The pulse coupling feed input is the main input source received by a pulse element, and neighboring pulse elements can influence its feed input signal through the linking mode. The external stimulus is received by the feed input domain and then coupled with the pulse signals of adjacent elements received by the link input domain before being sent to the internal activity item. The value of the internal activity term gradually increases with each cycle, while the dynamic threshold gradually decreases with each cycle t(i, j), and the internal activity term is compared with the dynamic threshold on every cycle s(i, j).

$$ A + B*t(i,j) + C*s(i,j) < 1 $$

(11)

$$ 10\log \left( 2.5^{x} - 2x - 1 \right)^{2} < 1 / \log \left( 2^{x} - x \right) $$

(12)

In contrast (see the log(2^x − x) term above), the LSCN (Long and Short Sequence Concerned Network), a simplified and improved version of the PCNN model, streamlines the input-signal acquisition mechanism, and the total number of undetermined parameters is greatly reduced. The traditional PCNN model contains three leakage integrators and needs to perform two pulse coupling operations; the LSCN model also contains three leakage integrators but requires only one pulse coupling operation. This means the time complexity of the LSCN model is lower than that of the traditional model, and the relationship between the internal activity items and the external stimuli is more direct. Moreover, unlike the traditional PCNN, the iteration process h(i, j)/x of the LSCN model stops automatically rather than being set manually, which makes it more convenient to operate over multiple iterations.

$$ \sqrt{ \Delta h_{x}(i,j)/x + \Delta h_{y}(i,j)/y + \Delta h_{z}(i,j)/z } = 1 $$

(13)

$$ 1 - \ln \sum |p(x) - p(x - 1)| - \ln p(x) \in p(1 - x) $$

(14)

In the perception process at this level, p(x) − p(x − 1), an independent preliminary judgment is made for each image and the corresponding conclusions are drawn; these judgments and conclusions are then perceived together to form the final joint decision. The decision-level perception method processes the least data of the three levels and has good fault tolerance and real-time performance, but it requires more pre-processing of the data.

$$ X(a,b,c) = R(a,b)/c + G(c,b)/a + B(a,c)/b $$

(15)

First, feature extraction X(a, b, c) is carried out on the original image, and then these features are perceived. Because the objects perceived at this level are not the images themselves but their features, the amount of data to be processed is compressed to a certain extent, which improves efficiency and is conducive to real-time processing. The candidate regions, classification probabilities, and extracted features generated by the PCNN network are then used to train the cascade classifier. The training set at the initial time contains all positive samples and an equal number of randomly sampled negative samples. A RealBoost classifier then performs the pedestrian classification.

The Adience dataset labels age and gender information together, suggesting that the model is actually a multi-task model, but it does not explore the intrinsic relationship between the two tasks to obtain better detection results. The model in Fig.3 had a gender identification accuracy of 66.8 percent on the Adience dataset. However, the completely discarded saliency maps actually contain some important saliency information, and discarding them makes the saliency detection of the PCNN model inaccurate. Therefore, based on the saliency information at the minimum-entropy scale, this paper takes the reciprocal of the corresponding entropy at the other scales as the contribution rate with which to perceive the saliency information at those scales, and thus proposes a multi-scale final saliency map determination method.

Information annotation of pulse coupling data set.

The visual boundary coefficient is more suitable for describing the difference between a visual boundary and a visual frame, and image enhancement facilitates visual boundary detection. Based on the diffusion principle of nonlinear partial differential equations, the model can control the diffusion direction by introducing an appropriate diffusion flux function, and it can also be combined with other visual boundary detection methods. In order to verify that the superpixel-based unsupervised FCM color visual perception method proposed in this chapter obtains the best segmentation effect, 50 images were selected from BSDS500 as experimental samples. Since the proposed method can automatically obtain the cluster number C, while the traditional clustering algorithm uses a fixed C value for each image, experiments were run both with a fixed C and with the automatically obtained C. The algorithm requires three essential parameters, namely the weighting index, the minimum error threshold, and the maximum number of iterations, which are set to 2, 15, and 50, respectively, in this experiment; the neighborhood window size is set to 3×3.

As can be seen in Fig.4, although the perceptual image obtained by the maximum-value method is optimal in terms of optical brightness, its edges show a more obvious "sawtooth" artifact and are more blurred. Compared with the source image, the perceived image obtained by the discrete wavelet transform method has obvious shortcomings in saturation and brightness. From the perspective of visual effect, the perceived image obtained by the visual perception transformation method shows obvious edge oscillation. In contrast, the proposed image perception algorithm based on compressed sensing theory achieves good visual results in terms of clarity, contrast, and detail representation. The visual boundary detection method based on the visual boundary coefficient has certain shortcomings in practical application: if the neighborhood of a boundary changes irregularly between frames during a shot cut, the visual boundary coefficient decreases, while visual jitter within a video clip can increase the coefficient; both effects can reduce the detection performance of the algorithm.

Image enhancement perception distribution.

If the minimum value of the interval in which the previous frame is located equals the minimum over all subintervals in the search window, a further comparison is made in the subinterval in which the current frame is located. Since the search window of the current frame does not necessarily coincide exactly with a subinterval, the minimum of the subinterval at the current frame boundary needs to be recalculated when determining the minima of the different subintervals (even without recalculation, the impact is limited).

Without the shared visual perception pulse coupling layer, P-Net would have to extract features separately for face detection and pedestrian detection from 224×224 pixel images, doubling the time spent training the two tasks, and R-Net, with its 448×448 pixel input, would take even more time. At the same time, face detection and pedestrian detection are intrinsically related: most faces can be localized within the pedestrian detection box, so jointly training the two tasks can improve the accuracy of both. Segmenting PMA (Plane Moving Average) sequences at their zero points is simple and fast, but it generates many long motion patterns. Long motion patterns are not conducive to key-frame extraction, because it is difficult to express visual content with a long motion pattern, and a long motion pattern expressed by the triangular model carries a large error and is not accurate. In this case, we can split a long motion pattern into multiple motion patterns; the separation method is to find the minimum points within the long motion pattern.

It can be seen that visual boundary detection using the visual boundary coefficient and the standard histogram intersection method each have advantages and disadvantages, and their overall performance is comparable. For the dataset in Fig.5, the fixed-minimum detection method using visual boundary coefficients shows different properties. In the face of common noise attacks, the improved PCNN model achieves a higher Area Under the Curve (AUC) value, which indicates that the improved model is more robust. If the cost of a false visual boundary detection is equal to that of a missed detection, the method using the visual boundary coefficient is slightly inferior to the standard histogram intersection method on the movie dataset but slightly better on the video dataset; if the costs of false and missed boundaries are not equal, the opposite holds. In general, the method using the symmetric weighted window frame difference and the moving average window frame difference is more stable and reliable than the method using the 1/2-symmetric weighted window frame difference and the 1/2-moving average window frame difference.

Parameter adjustment of boundary coefficient of visual perception.

See more here:

Simulation analysis of visual perception model based on pulse ... - Nature.com

Is running AI on CPUs making a comeback? – TechHQ

If somebody told you that a refurbished laptop could eclipse the performance of an NVIDIA A100 GPU when training a 200 million-parameter neural network, you'd want to know the secret. Running AI routines on CPUs is supposed to be slow, which is why GPUs are in high demand and NVIDIA shareholders are celebrating. But maybe it's not that simple.

Part of the issue is that the development and availability of GPUs, which can massively parallelize matrix multiplications, have made it possible to brute force progress in AI. Bigger is better when it comes to both the amount of data used to train neural networks and the size of the models, reflected in the number of parameters.

Considering state-of-the-art large language models (LLMs) such as OpenAI's GPT-4, the number of parameters is now measured in the billions. And training what is, in effect, a vast, multi-layered equation, by first specifying model weights at random and then refining those parameters through backpropagation and gradient descent, is now firmly GPU territory.

Nobody runs high-performance AI routines on CPUs, or at least that's the majority view. The growth in model size, driven by the gains in accuracy, has led users to overwhelmingly favor much faster GPUs to carry out billions of calculations back and forth.

But the scale of the latest generative AI models is putting this brute force GPU approach to the test. And many developers no longer have the time, money, or computing resources to compete when it comes to fine-tuning the billions of artificial neurons that comprise these many-layered networks.

Experts in the field are asking if there's another, more efficient way of training neural networks to perform tasks such as image recognition, product recommendation, and natural language processing (NLP) search.

Artificial neural networks are often compared to the workings of the human brain. But the comparison is a loose one: the human brain operates on roughly the power of a dim light bulb, whereas state-of-the-art AI models require vast amounts of power, have worryingly large carbon footprints, and require large amounts of cooling.

That being said, the human brain consumes a considerable amount of energy compared with other organs in the body. But its orders-of-magnitude, GPU-beating efficiency stems from the fact that the brain's chemistry only recruits the neurons that it needs rather than having to perform calculations in bulk.

AI developers are trying to mimic those brain-like efficiencies in computing hardware by engineering architectures known as spiking neural networks, in which neurons behave more like accumulators and fire only when repeatedly prompted. But it's a work in progress.

However, it's long been known that training AI algorithms could be made much more efficient. Matrix multiplications assume dense computations, but researchers showed a decade ago that just picking the top ten percent of neuron activations will still produce high-quality results.

The issue is that to identify the top ten percent you would still have to run all of those sums in bulk, which would remain wasteful. But what if you could look up a list of those most active neurons based on a given input?

And it's the answer to this question that opens up the path to running AI on CPUs, which is potentially game-changing, as the observation about a refurbished laptop eclipsing an NVIDIA A100 GPU hints.

So what is this magic? At the heart of the approach is the use of hash tables, which famously run in constant time (or thereabouts). In other words, searching for an entry in a hash table is independent of the number of locations. And Google puts this principle to work on its web search.

For example, if you type "Best restaurants in London" into Google Chrome, that query, thanks to hashing, which turns the input into a unique fingerprint, provides the index to a list of topical websites that Google has filed away at that location. And it's why, despite having billions of websites stored in its vast index, Google can deliver search results to users in a matter of milliseconds.

And, just as your search query in effect provides a lookup address for Google, a similar approach can be used to identify which artificial neurons are most strongly associated with a piece of training data, such as a picture of a cat.

In neural networks, hash tables can be used to tell the algorithm which activations need to be calculated, dramatically reducing the computational burden to a fraction of brute force methods, which makes it possible to run AI on CPUs.

In fact, the class of hash functions that turn out to be most useful are dubbed locality-sensitive hash (LSH) functions. Regular hash functions are great for fast memory addressing and duplicate detection, whereas locality-sensitive hash functions provide near-duplicate detection.

LSH functions can be used to hash data points that are near to each other (in other words, similar) into the same buckets with high probability. And this, in terms of deep learning, dramatically improves sampling performance during model training.
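A minimal sketch of this idea, using random-hyperplane (SimHash-style) LSH with NumPy; the dimensions and the number of hyperplanes are arbitrary choices for illustration, not parameters from the article.

```python
import numpy as np

rng = np.random.default_rng(42)

def lsh_signature(x, planes):
    """Random-hyperplane LSH: the sign pattern of x against a fixed set of
    random hyperplanes is the bucket id; nearby vectors tend to collide."""
    return tuple((planes @ x > 0).astype(int))

planes = rng.standard_normal((8, 64))    # 8 hyperplanes in a 64-d space
a = rng.standard_normal(64)
b = a + 0.05 * rng.standard_normal(64)   # a near-duplicate of a
c = rng.standard_normal(64)              # an unrelated vector

print(lsh_signature(a, planes) == lsh_signature(b, planes))  # usually True
print(lsh_signature(a, planes) == lsh_signature(c, planes))  # usually False
```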

Hash functions can also be used to improve the user experience once models have been trained. And computer scientists based in the US at Rice University, Texas, Stanford University, California, and from the Pocket LLM pioneer ThirdAI, have proposed a method dubbed HALOS: Hashing Large Output Space for Cheap Inference, which speeds up the process without compromising model performance.

As the team explains, HALOS reduces inference to sub-linear computation by selectively activating only a small set of likely-to-be-relevant output layer neurons. "Given a query vector, the computation can be focused on a tiny subset of the large database," write the authors in their conference paper. "Our extensive evaluations show that HALOS matches or even outperforms the accuracy of given models with 21× speed up and 87% energy reduction."
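The sketch below illustrates the general idea of restricting output-layer computation to one hash bucket. It is a heavily simplified, hypothetical example (real systems such as HALOS use multiple tables and more refined probing), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical "large output layer": one weight vector per output neuron.
n_out, dim, bits = 20_000, 128, 8
W = rng.standard_normal((n_out, dim))

# Index every output neuron once with random-hyperplane LSH.
planes = rng.standard_normal((bits, dim))
codes = ((W @ planes.T) > 0).astype(np.int64) @ (1 << np.arange(bits))
buckets = {}
for idx, code in enumerate(codes):
    buckets.setdefault(int(code), []).append(idx)

def sublinear_scores(query):
    """Score only the neurons whose bucket matches the query's bucket,
    instead of all n_out of them."""
    q_code = int(((query @ planes.T) > 0).astype(np.int64) @ (1 << np.arange(bits)))
    return {i: float(W[i] @ query) for i in buckets.get(q_code, [])}

query = rng.standard_normal(dim)
print(f"{len(sublinear_scores(query))} of {n_out} output neurons scored")
```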

Commercially, this approach is helping merchants such as Wayfair, an online retailer that enables customers to find millions of products for their homes. Over the years, the firm has worked hard to improve its recommendation engine, noting a study by Amazon showing that even a 100-millisecond delay in serving results can put a noticeable dent in sales.

And, sticking briefly with online shopping habits, more recent findings published by Akamai report that over half of mobile website visitors will leave a page that takes more than three seconds to load, which is food for thought given that half of consumers are said to browse for products and services on their smartphones.

All of this puts pressure on claims that clever use of hash functions can enable AI to run on CPUs. But the approach more than lived up to expectations, as Wayfair has confirmed in a blog post. "We were able to train our version three classifier model on commodity CPUs, while at the same time achieve a markedly lower latency rate," commented Weiyi Sun, Associate Director of Machine Learning at the company.

Plus, as the computer scientists described in their study, the use of hash-based processing algorithms accelerated inference too.

Here is the original post:

Is running AI on CPUs making a comeback? - TechHQ

AI’s Transformative Impact on Industries – Fagen wasanni

Artificial intelligence (AI) has made remarkable progress in recent years, revolutionizing various industries and capturing the imagination of experts worldwide. Several notable research projects have emerged, showcasing the immense potential of AI and its transformative impact on different sectors.

One prominent project is DeepMind's AlphaFold, an AI system that accurately predicts protein folding structures using deep learning algorithms. This breakthrough has the potential to revolutionize bioinformatics and accelerate drug discovery processes by enabling a better understanding of protein structures and their functions.

In the healthcare industry, IBM Watson's cognitive computing capabilities have paved the way for personalized medicine and improved diagnostics. Watson can analyze vast amounts of patient data, medical research, and clinical guidelines to provide evidence-based treatment recommendations. Its application in oncology has shown promising results, aiding doctors in making informed decisions and improving patient outcomes.

Another notable project is Google Brain, an AI system introduced in 2011. Google Brain focuses on open learning and aims to emulate the functioning of the human brain as closely as possible. It has achieved significant success in simulating human-like communication between AI entities, demonstrating the learning capabilities and adaptability of AI systems.

Google Brain's Transformer, a neural network architecture, has revolutionized natural language processing and machine translation. Its attention mechanism allows the model to focus on relevant parts of the input sequence, overcoming the limitations of traditional neural networks. The Transformer has significantly improved translation quality and found success in various NLP and computer vision tasks.

Lastly, Google DeepMind's AlphaGo is a milestone in AI, beating world champions in the game of Go and pushing the boundaries of AI in strategic board games. The development of AlphaGo Zero, which relies solely on reinforcement learning, marked a true breakthrough in AI mastery.

These projects demonstrate the transformative impact of AI on various industries, from healthcare to language processing to strategic games. AI continues to push boundaries and open new possibilities in different fields, promising a future of boundless possibilities.

Read the original post:

AI's Transformative Impact on Industries - Fagen wasanni

ASU researchers bridge security and AI – Full Circle

Fast-paced advancements in the field of artificial intelligence, or AI, are proving the technology is an indispensable asset. In the national security field, experts are charting a course for AI's impact on our collective defense strategy.

Paulo Shakarian is at the forefront of this critical work using his expertise in symbolic AI and neuro-symbolic systems, which are advanced forms of AI technology, to meet the sophisticated needs of national security organizations.

Shakarian, an associate professor of computer science in the School of Computing and Augmented Intelligence, part of the Ira A. Fulton Schools of Engineering at Arizona State University, has been invited to attend AI Forward, a series of workshops hosted by the U.S. Defense Advanced Research Projects Agency, or DARPA.

The event includes two workshops: a virtual meeting that took place earlier this summer and an in-person event in Boston from July 31 to Aug. 2.

Shakarian is among 100 attendees working to advance DARPA's initiative to explore new directions for AI research impacting a wide range of defense-related tasks, including autonomous systems, intelligence platforms, military planning, big data analysis and computer vision.

At the Boston workshop, Shakarian will be joined by Nakul Gopalan, an assistant professor of computer science, who was also selected to attend the event to explore how his research in human-robot communication might help achieve DARPA's goals.

In addition to his involvement in AI Forward, Shakarian is preparing to release a new book in September 2023. The book, titled Neuro-symbolic Reasoning and Learning, will explore the past five years of research in neuro-symbolic AI and help readers understand recent advances in the field.

As Shakarian and Gopalan prepared for workshops, they took a moment to share their research expertise and thoughts on the current landscape of AI.

Explain your research areas. What topics do you focus on?

Paulo Shakarian: My primary focus is symbolic AI and neuro-symbolic systems. To understand them, it's important to talk about what AI looks like today, primarily as deep learning neural networks, which have been a wonderful revolution in technology over the last decade. Looking at problems specifically relevant to the U.S. Department of Defense, or DoD, these AI technologies were not performing well. There are several challenges, including black box models and their explainability, systems not being inherently modular because they're trained end-to-end, and the enforcement of constraints to help avoid collisions and interference when multiple aircraft share the same airspace. With neural networks, there's no inherent way in the system to enforce constraints. Symbolic AI has been around longer than neural networks, but it is not data-driven, while neural networks are and can learn symbols and repeat them back. Traditionally, symbolic AI's abilities have not been demonstrated anywhere near the learning capacity of a neural network, but all the issues I've mentioned are shortcomings of deep learning that symbolic AI can address. When you start to get into these use cases that have significant safety requirements, like in defense, aerospace and autonomous driving, there is a desire to leverage a lot of data while accounting for safety constraints, modularity and explainability. The study of neuro-symbolic AI uses a lot of data with those other parameters in mind.

Nakul Gopalan: I focus on the area of language grounding, planning and learning from human users for robotic applications. I attempt to use demonstrations that humans provide to teach AI systems symbolic ideas, like colors, shapes, objects and verbs, and then map language to these symbolic concepts. In that regard, I also develop neuro-symbolic approaches to teaching AI systems. Additionally, I work in the field of robot learning, which involves implementing learning policies to help robots discover how to solve specific tasks. Tasks can range from inserting and fastening bolts in airplane wings to understanding how to model an object like a microwave so a robot can heat food. Developing tools in these large problem areas in machine learning and artificial intelligence can enable robots to solve problems with human users.

Tell me about your research labs. What research are you currently working on?

PS: The main project I've been working on in my lab, Lab V2, is a software package we call PyReason. One of the practical results of the neural network revolution has been really great software like PyTorch and TensorFlow, which streamline a lot of the work of making neural networks. Google and Meta put considerable effort into these pieces of software and made them free to everyone. We've noticed in neuro-symbolic literature that everyone is reinventing the wheel, in a sense, by creating a new subset of logic for their particular purposes. Much of this work already has copious amounts of literature previously written on it. In creating PyReason, my collaborators and I wanted to create the best possible logic platform for working with machine learning systems. We have about three or four active grants with it, and people have been downloading it, so it has been our primary work. We wanted to create a very strong piece of software to enable this research, so you don't have to keep reimplementing old bits of logic. This way it's all there, it's mature and relatively bug-free.

NG: My lab, the Logos Robotics Lab, focuses on teaching robots a human approach to learning and solving tasks. We also work on representations for task solving to understand how robots can model objects so they can solve the tasks we need robots to solve, like learning how to operate a microwave, for example, and understanding how to open its door and put an object inside. We use machine learning techniques to discover robots' behavior and focus on teaching robots tasks from human users via sample-efficient machine learning methods. Our team learns about object representations, such as modeling microwaves, toasters and pliers, to understand how robots can use them. One concept we work on is tactile sensing, which helps to recognize objects and use them for solving tasks by touch. We do all this with a focus on integrating these approaches with human coworker use cases so we can demonstrate the utility of these learning systems in the presence of a person working alongside the robot. Our work touches practical problems in manufacturing and socially relevant problems, such as introducing robots into domains like assisted living and nursing.

What initially drew you to engineering and drove you to pursue work in this field?

PS: I had an interesting journey to get to this point. Right out of high school, I went to the United States Military Academy at West Point, graduated, became a military officer and was in the U.S. Army's 1st Armored Division. I had two combat tours in Iraq, and after my second combat tour, my unit sent me on a three-month temporary assignment to DARPA as an advisor because I had combat experience and a technical degree, a bachelor's degree in computer science. At DARPA, I learned how some of our nation's top scientists were applying AI to solve relevant defense problems and became very interested in both intelligence and autonomy. Being trained in military intelligence, I've worked in infantry and armor units to understand how intelligence assets were supporting the fight, and I saw that the work being done at DARPA was light years beyond what I was doing manually. After that, I applied to a special program to go back to graduate school and earned my doctoral degree, focusing on AI. As part of that program, I also taught for a few years at West Point. After completing my military service, I joined the faculty at ASU in 2014.

NG: I have been curious about learning systems related to control and robotic applications since my undergraduate degree studies. I was impressed by the capability of these systems to adapt to a human user's needs. As for what drew me to engineering, I was always fascinated by math and even competed in a few math competitions in high school. A career in engineering was a way for me to pursue this interest in mathematics for practical applications. A common reason for working in computer science research is its similarity to the mathematics field. The computer science field can solve open-ended theoretical problems while producing practical applications of this theoretical research. Our work in the School of Computing and Augmented Intelligence embodies these ideals.

There's so much hysteria and noise in the media about AI. Speaking as professional researchers in this field, are we near any truly useful applications that are going to be game changers for life in various industries?

PS: Yes, I think so. We've already seen what convolutional neural networks did for image recognition and how that has been embedded in everything from phones to security cameras and more. We're going to see a very similar phenomenon with large language models. The models have problems, and the main one is a concept called hallucinations, which means the models give wrong answers or information. We also can't have any strong safety guarantees with large language models if you can't explain where the results came from, which is the same problem with every other neural model. Companies like Google and OpenAI are doing a lot of testing to mitigate these potential issues that could come out, but there's no way they could test every possible case. With that said, I expect to see things like the context window, or the amount of data you can put in a prompt, expand with large language models in the next year. That's going to help improve both the training and use of these models. There have been a lot of techniques introduced in the past year that will significantly improve the accuracy in everyday use cases, and I think the public will see a very low error rate. Large language models are crucial in generating computer code, and that's likely to be the most game-changing, impactful result. If we can write code faster, we can inherently innovate faster. Large language models are going to help researchers continue to act as engines of innovation, particularly here in the U.S. where these tools are readily available.


NG: Progress in machine learning has been meteoric. We have seen the rise of generative models for language, images, videos and music in the last few years. There are already economic consequences of these models, which we're seeing in industries such as journalism, writing, software engineering, graphic design, law and finance. We may one day see fewer of these kinds of jobs as our efficiency in pursuing this advancement increases, but there are still questions about the accuracy and morality of using such technology and its lasting social and economic impacts. There is some nascent understanding of the physical world in these systems, but they are still far from being efficient when collaborating with human users. I think this technology will change the way we function in society, just as introducing computers changed the type of jobs people aspire toward, but researchers are still focused on developing the goal of artificial general intelligence, which is AI that understands the physical world and functions independently in it. We are still far from such a system, although we have developed impressive tools along the way.

Do you think AI's applications in national security will ever get to a point where the public sees this technology in use, such as the autonomous vehicles being tested on roads in and around Phoenix, or do you think it will stay behind the scenes?

PS: When I ran my startup company, I learned that it was important for AI to be embedded in a solution that everyone understands on a daily basis. Even with autonomous vehicles, the only difference is that there's no driver in the driver's seat. The goal is to get these vehicles to behave like normal cars. But the big exception to all of this is ChatGPT, which has turned the world on its head. Even with these technologies, I have a little bit of doubt that our current interface will be the way we interact with these types of AI going forward, and the people at OpenAI agree.

I see further development in the future to better integrate technology like ChatGPT into a normal workflow. We all have tools we use to get work done, and there are always small costs associated. With ChatGPT, there's the cost of flipping to a new window, logging into the program and waiting for it to respond. If you're using it to craft an email that's only a few sentences long, it may not feel worth it, and then you don't think of this as a tool to make an impact as often as you should. If ChatGPT were more integrated into processes, I think use of it would be different. It's such a compelling technology, and I think that's why they were able to release it in this very simple, external chat format.

NG: We use a significant amount of technology developed for national security for public purposes, in applications from the internet to GPS devices. As technology becomes more accessible, it continues to be declassified and used in public settings. I expect the same will happen for most such research products developed by DARPA.

Link:

ASU researchers bridge security and AI - Full Circle

Spatial attention-based residual network for human burn … – Nature.com

Accurate diagnosis of human burns requires a sensitive model. ML and DL are commonly employed in medical imaging for disease diagnosis. ResNeXt, AlexNet, and VGG16 are state-of-the-art deep-learning models frequently utilized for medical image diagnosis. In this study, we evaluated and compared the performance of these models for diagnosing burn images. However, these models showed limited effectiveness in accurately diagnosing burn degree and in distinguishing grafts from non-grafts.

ResNeXt, a deep residual model, consists of 50 layers, while AlexNet and VGG16 are sequential models with eight and 16 layers, respectively. These layers extract features from the burned images during the model's training process. Unfortunately, distinguishing between deep dermal and full-thickness burns can be challenging, as they share similar white, dark red, and brown colors. Consequently, highly delicate and stringent methods are required for accurate differentiation. AlexNet and VGG16, being sequential models, mainly extract low-level features, whereas ResNeXt excels in extracting high-dimensional features. A limitation is that these models can only learn positive-weight features due to the ReLU activation function. This constraint may hinder their ability to precisely identify critical burn characteristics. The DL models AlexNet, ResNeXt, VGG16, and InceptionV3 are widely used for medical image diagnosis; however, these models encounter challenges in accurately categorizing burn degrees and differentiating grafts from non-grafts. Finding effective ways to handle these challenges and improve feature extraction could lead to more sensitive and reliable burn diagnosis models.

The ResNeXt model33 influenced the BuRnGANeXt50 model. To construct the BuRnGANeXt50 model, the original ResNeXt model's topology is modified. Moreover, the original ResNeXt was created to classify images into several categories at a high computational cost. In this study, the method performs both a multiclass and a binary classification task. Multiclass classification is used to assess burn severity based on burn depth. After that, based on depth, burns are divided into two distinct types: graft and non-graft. Reducing the first-layer filter size from 7×7 to 5×5 is the first change to the original ResNeXt model's design, because the larger filter size resulted in lower pixel intensity in the burnt region, which led to a rise in the frequency of spurious negative results for both grafts and non-grafts. Furthermore, the convolution sizes of Conv1, Conv2, Conv3, Conv4, and Conv5 are changed to reduce the computational cost while maintaining cardinality. We also applied Leaky ReLU instead of the ReLU activation for faster model convergence. Table 2 also shows that conv2, conv3, and conv4 are reduced in size. After implementing all modifications, the number of neurons decreased from \(23 \times 10^{6}\) to \(5 \times 10^{6}\), as shown in Table 3. The detailed architecture of the proposed model is shown in Fig.1.
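As a rough illustration only (not the authors' BuRnGANeXt50 code), the PyTorch sketch below applies two of the stated modifications, a 5×5 first-layer filter and Leaky ReLU in place of ReLU, to a stock ResNeXt-50 backbone with cardinality 32 and a three-class head; layer sizes beyond these are left at torchvision defaults.

```python
import torch
import torch.nn as nn
from torchvision.models import resnext50_32x4d

# Stock ResNeXt-50 (cardinality 32) with a hypothetical 3-class head (k = 3 burn depths).
model = resnext50_32x4d(weights=None, num_classes=3)

# Modification 1: 5x5 first-layer filter instead of the original 7x7.
model.conv1 = nn.Conv2d(3, 64, kernel_size=5, stride=2, padding=2, bias=False)

# Modification 2: swap every ReLU for Leaky ReLU (slope chosen arbitrarily here).
def swap_relu(module):
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, nn.LeakyReLU(0.1, inplace=True))
        else:
            swap_relu(child)

swap_relu(model)
print(model(torch.randn(1, 3, 100, 100)).shape)   # 100x100x3 input -> torch.Size([1, 3])
```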

Topology of BuRnGANeXt50 for human burn diagnosis.

This model has several essential building blocks, including convolution, residual, ReLU activation, softmax, and flattened layers. The results of grouped convolutions of neurons inside the same kernel map are summed together by pooling layers, which reduce the input dimensionality and enhance model performance. The pooling units in the proposed model constitute a grid, with each unit representing a single voting location, and the stride is selected to gain overlap while reducing overfitting. Figure 2 describes the structure of the model's convolution layer. Pooling units form a grid, each summarizing a \(z \times z\) neighborhood centered at its voting location. The standard CNN setting is \(S = z\), where \(S\) is the stride between pooling units; in the provided model we set \(S < z\) to increase overlap and decrease overfitting34. The proposed architecture was developed to handle the unique issues of burn diagnosis, emphasizing decreasing overfitting and enhancing model accuracy.
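For instance, overlapping pooling with window \(z = 3\) and stride \(S = 2\) (so \(S < z\)) can be expressed in PyTorch as follows; the tensor sizes are arbitrary examples.

```python
import torch
import torch.nn as nn

# Overlapping pooling: window z = 3 with stride S = 2 (S < z), so neighbouring
# pooling regions overlap, which is the setting described above.
overlap_pool = nn.MaxPool2d(kernel_size=3, stride=2)
x = torch.randn(1, 64, 50, 50)        # a hypothetical feature map
print(overlap_pool(x).shape)          # -> torch.Size([1, 64, 24, 24])
```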

The pooling layers are convolutions in a grouped manner.

The inner dot product is an elementary operation performed by the neurons of an artificial neural network's convolutional and fully connected layers. The inner dot product computes the aggregated transform, as illustrated in Eq.(1).

$$\sum\limits_{i = 1}^{K} w_{i} \rho_{i}$$

(1)

Here \(\rho\) represents the neuron's K-channel input vector, and the filter weight for the i-th channel is given by \(w_{i}\). This model replaces the elementary transformation \(w_{i} \rho_{i}\) with a more generic function. By expanding along a new dimension, this generic function reduces depth. The model calculates the aggregated transformation as follows:

$$\Im \left( \rho \right) = \sum\limits_{i = 1}^{\mathbb{C}} \Upsilon_{i} \left( \rho \right)$$

(2)

The function \(\Upsilon_{i}(\rho)\) is arbitrarily defined: \(\Upsilon_{i}\) projects \(\rho\) into a low-dimensional embedding and then transforms it, similar to an elementary neuron. \(\mathbb{C}\) represents the number of transforms to be summed in Eq.(2) and is known as the cardinality35. The aggregated transformation of Eq.(2) serves as the residual function36 (Fig.3):

$$x = \rho + \sum\limits_{i = 1}^{\mathbb{C}} \Upsilon_{i} \left( \rho \right)$$

(3)

where \(x\) is the model's predicted result.

Channel and spatial attention modules are depicted in (A) and (B), respectively, in these schematic illustrations.
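To make the aggregated residual transformation of Eqs.(2) and (3) concrete, here is a minimal PyTorch sketch in which the \(\mathbb{C}\) parallel low-dimensional transforms are realised as a grouped convolution; the channel sizes are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class AggregatedResidualBlock(nn.Module):
    """Minimal ResNeXt-style block: x = rho + sum_i Upsilon_i(rho), with the
    C parallel transforms realised as a grouped 3x3 convolution."""
    def __init__(self, channels=256, bottleneck=128, cardinality=32):
        super().__init__()
        self.reduce = nn.Conv2d(channels, bottleneck, kernel_size=1, bias=False)
        self.grouped = nn.Conv2d(bottleneck, bottleneck, kernel_size=3,
                                 padding=1, groups=cardinality, bias=False)
        self.expand = nn.Conv2d(bottleneck, channels, kernel_size=1, bias=False)
        self.act = nn.LeakyReLU(0.1)

    def forward(self, rho):
        y = self.act(self.reduce(rho))
        y = self.act(self.grouped(y))     # C parallel low-dimensional transforms
        y = self.expand(y)
        return rho + y                    # residual: x = rho + aggregated transform

x = torch.randn(1, 256, 25, 25)
print(AggregatedResidualBlock()(x).shape)   # -> torch.Size([1, 256, 25, 25])
```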

Finally, a flattened layer and a global average pooling layer are added at the top of the model. The Softmax activation classifies burns into binary and multiclass categories. The softmax function uses the exponent of each output-layer logit to convert logits to probabilities37. The vector \(\Phi\) is the system input, representing the feature set. Our study uses k-class classification with three levels of burn severity (k = 3) and two levels of graft versus non-graft (k = 2). For predicting classification results, the bias \(W_{0} X_{0}\) is added in each iteration.

$$p(\rho = i|\Phi^{\left( j \right)} ) = \frac{e^{\Phi^{\left( j \right)}}}{\sum\nolimits_{i = 0}^{k} e^{\Phi_{k}^{\left( j \right)}}}$$

(4)

$$\text{in which}\;\Phi = W_{0} X_{0} + W_{1} X_{1} + \ldots + W_{k} X_{k}$$

(5)
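Equations (4) and (5) amount to the standard softmax over the k class logits; a small NumPy sketch with hypothetical logits:

```python
import numpy as np

def softmax(phi):
    """Convert logits Phi (Eq. 4) to class probabilities; k = len(phi)."""
    e = np.exp(phi - phi.max())          # subtract max for numerical stability
    return e / e.sum()

# Hypothetical logits for the three burn-severity classes (k = 3).
print(softmax(np.array([2.0, 0.5, -1.0])))
```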

The residual attention block, which allows attention to be routed across groups of separate feature maps, is shown in Fig.3. Furthermore, the channel's extra feature map groups combine the spatial information of all groups via the spatial attention module, boosting the CNN's capacity to represent features. It comprises feature map groups, feature transformation channels, spatial attention algorithms, and so on. Convolution procedures can be performed on feature groups, and the cardinality specifies the number of feature map groups. A new parameter, "S", indicates the total number of groups in the channel set38 and the number of subgroups in each of the N input feature groups. A channel shuffler optimizes the processing of incoming data through the channels; this method transforms feature subsets. The total number of feature groups is G = N × S.

Using Eq.(6), we conduct an essential feature modification on subgroups inside each group after channel shuffling.

$$g(r,i,j) = \begin{bmatrix} \cos \frac{r\pi}{2} & -\sin \frac{r\pi}{2} \\ \sin \frac{r\pi}{2} & \cos \frac{r\pi}{2} \end{bmatrix} \begin{bmatrix} i \\ j \end{bmatrix}$$

(6)
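Equation (6) rotates the coordinate \((i, j)\) by \(r \times 90^{\circ}\); a short NumPy check for illustration:

```python
import numpy as np

def g(r, i, j):
    """Rotate the coordinate (i, j) by r * 90 degrees (Eq. 6), 0 <= r < 4."""
    theta = r * np.pi / 2
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return R @ np.array([i, j])

print(g(1, 1, 0))   # (1, 0) rotated by 90 degrees -> approximately [0, 1]
```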

Here \(0 \le r < 4\), and \((i, j)\) stands for the coordinates of the original matrix. \(K\) represents the 3×3 convolution of the bottleneck block, and its output is written as \(y_{s}\). Then, for each input \(x_{s}\) we have:

$$y_{s} = \begin{cases} K\left( g_{r} \left( x_{s} \right) \right), & r, s = 0 \\ K\left( g_{r} \left( x_{s} \right) \right) \odot y_{0}, & 0 < r = s < 4 \end{cases}$$

(7)

Here \(g_{r}\) represents the transformation applied to the input \(x_{s}\), and \(\odot\) denotes element-wise multiplication of the corresponding feature maps in the feature transformation. The features of \(x\) being transformed are shared across the three 3×3 convolution operators \(K\).

Semantic-specific feature representations can be improved by exploiting the interdependencies among channel graphs. We use the feature map's channels as individual detectors. Figure 3A depicts how we send the feature map of the \(no \in \{1, 2, \ldots, N\}\) group, \(G^{no} \in R^{C/N \times H \times W}\), to the channel attention module. As a first step, we use global average pooling (GAP) to gather global context information linked to channel statistics39. The 1D channel attention maps \(C^{no} \in R^{C/N}\) are then inferred using the shared fully connected layers.

$$C^{n} = D_{\mathrm{sigmoid}} \left( D_{\mathrm{ReLU}} \left( \mathrm{GAP}\left( G_{n} \right) \right) \right)$$

(8)

("{D}_{sigmoid}and{D}_{mathit{Re}LU}") represents a fully linked layer that uses both "Sigmoid" and "ReLU" as activation functions. At last, Hadamard products are used to infer a groups attention map and the corresponding input features. Then the components from each group are weighted and added together to produce an output feature vector. The final channel attention map

$$C \in R^{C/N \times H \times W}, \quad C = \sum\limits_{n = 1}^{N} \left( C^{n} \odot G^{n} \right)$$

(9)

Each group's 1×1 convolution kernel weight is multiplied by the 3×3 kernel weight from the subgroup's convolutional layer. The global feature dependency is preserved by adding the groups' channel attention weights, which all sum to the same value.
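A minimal PyTorch sketch of the per-group channel attention of Eqs.(8) and (9), assuming N feature groups, a shared two-layer fully connected block, and an arbitrary reduction ratio; it illustrates the mechanism rather than reproducing the paper's exact layer sizes.

```python
import torch
import torch.nn as nn

class GroupChannelAttention(nn.Module):
    """Per-group GAP -> shared FC (ReLU) -> FC (Sigmoid) -> Hadamard product
    with the group's feature map, summed over groups (Eqs. 8-9)."""
    def __init__(self, group_channels=64, reduction=4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(group_channels, group_channels // reduction),
            nn.ReLU(),
            nn.Linear(group_channels // reduction, group_channels),
            nn.Sigmoid(),
        )

    def forward(self, groups):            # groups: list of N tensors (B, C/N, H, W)
        out = 0
        for g in groups:
            w = self.fc(g.mean(dim=(2, 3)))          # GAP over H, W -> (B, C/N)
            out = out + g * w[:, :, None, None]      # C^n (Hadamard) G^n, summed over n
        return out

groups = [torch.randn(1, 64, 25, 25) for _ in range(4)]
print(GroupChannelAttention()(groups).shape)         # -> torch.Size([1, 64, 25, 25])
```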

A spatial attention module is used to synthesize spatial links and increase the spatial extent of the associated features. This component is separate from the channel attention module. The spatial information of the feature maps is first aggregated using global average pooling (GAP) and global max pooling (GMP)39 to obtain two distinct contextual descriptors. Next, \(GAP(C) \in R^{1 \times H \times W}\) and \(GMP(C) \in R^{1 \times H \times W}\) are connected to obtain \(S_{c} \in R^{2 \times H \times W}\):

$$S_{c} = \mathrm{GAP}\left( C \right) + \mathrm{GMP}\left( C \right)$$

(10)

The plus sign + denotes a linked (concatenated) feature map. A regular convolutional layer then retrieves the spatial dimensional weight information. The final spatial attention map \(S \in R^{C/N \times H \times W}\) is obtained by element-wise multiplying the convolved map \(S_{conv}\) with the input feature map \(C\):

$$S = \mathrm{Conv}_{3 \times 3} \left( S_{C} \right) \odot C$$

(11)

("Con{v}_{3times 3}") means regular convolution, while "Sigmoid" denotes the activation function.

Leaky ReLU activation-based deep learning models do not rely on input normalization to avoid saturation, and their neurons are more efficient at learning from negative inputs. Nevertheless, the neural activity \(\alpha_{u,v}^{i}\) at a point \((u, v)\) is computed by applying kernel \(i\), which facilitates generalization, and the ReLU nonlinearity is then applied. The response-normalized activity \(b_{u,v}^{i}\) is determined using Eq.(12).

$$b_{u,v}^{i} = \frac{\alpha_{u,v}^{i}}{\left( t + \alpha \sum\nolimits_{j = \max(0,\, i - n/2)}^{\min(N - 1,\, i + n/2)} \left( \alpha_{u,v}^{j} \right)^{2} \right)^{\beta}}$$

(12)

where \(N\) is the total number of kernel maps and \(t, \alpha, n, \beta\) are constants. The sum runs over the \(n\) neighboring kernel maps40. We trained the network on \(100 \times 100 \times 3\) images with the original ResNeXt CNN topology's cardinality hyper-parameter \(\mathbb{C} = 32\). The algorithm of the proposed method is shown below.
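PyTorch ships a local response normalization layer with the same form as Eq.(12); the constants below are common defaults chosen for illustration, since the exact values of \(t, \alpha, n, \beta\) are not listed here.

```python
import torch
import torch.nn as nn

# Local response normalization as in Eq. (12): each activation is divided by
# (t + alpha * sum of squares over n neighbouring kernel maps) ** beta.
# size=n, alpha, beta, k=t are hypothetical values, not the paper's settings.
lrn = nn.LocalResponseNorm(size=5, alpha=1e-4, beta=0.75, k=2.0)
x = torch.randn(1, 64, 25, 25)
print(lrn(x).shape)   # -> torch.Size([1, 64, 25, 25])
```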

Algorithm of the proposed method.

All authors contributed to the conception and design of the study. All authors read and approved the final manuscript.

Excerpt from:

Spatial attention-based residual network for human burn ... - Nature.com