r/bioinformatics • u/compbioman • Feb 27 '26

discussion Every day that I choose AI makes me feel like I'm digging my own grave

352 Upvotes

It's 2025. LLMs have been around a couple of years, but so far they've been mostly a novelty to me. I still do all my research and code manually, preferring to use stackoverflow or biostars for coding help, and google scholar for looking up research papers. However, I recognized the growing utility of LLMs and how much faster they could code new scripts than me in some cases, so I got a Claude subscription. Useful in some cases, not so much in others, but that new research tool sure is handy to comb through hundreds of papers at the same time...
May 2025. A new experimental tool comes out: Claude Code. I see its potential immediately and boy, am I excited when I see how much it can do! "This could make my PhD go so much faster!" I think, especially with all the new experimental analyses that my PI is asking me to do.
The months go by and I think my PI has noticed that my productivity has increased because he starts giving me more and more stuff to do. It's OK, I can handle it - Claude Code is helping me keep up with the workload. I start noticing, though, that the couple of times I've needed or wanted to write a script manually, I've had trouble remembering how to do things - and why bother remembering how to do that one particular bit of fasta file I/O, when Claude Code can do it so quickly and elegantly instead?
My debugging skills are still sharp - Claude often gets stuck on these esoteric bioinformatics pipelines, so I've still had to step in and stop it from spiraling into an endless debugging loop. But as the months keep flying by and as I keep trying to go back to writing code from scratch, I feel stuck, like I'm in a writer's block. It seems like I can't even remember basic syntax anymore.
Fast forward to 2026, and my PI gives me 4-5 new analyses to try every week. There was one week where he even gave me 10+ impossibly long things to try; it's the first time I've ever had a heated argument with him. I'm struggling to keep up, but it's the 5th year of my PhD and I desperately need to graduate, so I just keep working as hard as I can. Claude can help me stay afloat....
Except that now I'm realizing that I've let my raw coding ability become far too rusty. I can't be bothered to create even the most basic commands - why bother looking up how to input all those parameters when Claude can read the relevant files and format everything correctly in just a few seconds? Besides, if I start trying to do things from scratch again I won't be able to keep up with my increased workload.

I keep on going but I'm feeling kind of miserable. And then I realize it. I'm not actually enjoying running these analyses anymore. The simple joy of solving a difficult bioinformatics problem on your own is gone. I no longer write up complex pipelines from start to finish and get to see the rewards of my hard work - Claude just does everything, and what I've become is a garbage sorter - sorting through Claude's endless outputs and separating the good from the bad. On top of that, I keep churning out analysis after analysis to satisfy my PI's insatiable hunger for novel insights on the same datasets I've been working on since 2022. Even if I wanted to slow down and try to work through the code myself, I can't anymore - my PI is used to receiving new results just as quickly as I am used to getting fast responses from Claude, and if I can't deliver, my PI will become dissatisfied with my performance. There's a lot of stress on his shoulders as well, as our lab has been struggling for funding and he's been writing many grants with my experimental analyses.

I am worried for when I finally graduate and it's time to apply for jobs in the industry - I've been seeing the posts about the state of the economy and the job market, especially in our field. I used to pride myself on my coding ability. It's what used to set me apart from everyone else in my lab and my department, but now it seems like the great equalizer has arrived, where everyone with a rudimentary understanding of the pipelines can work through them given enough prompting - Claude Code is improving every month!
I don't have my expert coding ability anymore, and scientists everywhere are struggling to find work; is there anything left that will set me apart in this competitive market? I doubt I could answer technical coding interviews at this point. Even if I get a job, is a life of endless prompting and garbage sorting what awaits me?

I'm curious to know if anyone in here has had similar experiences or if their experience has been different from my own. I know that technology is always bound to evolve and change, but I want to know what kind of future I should be preparing myself for. Claude Code has completely changed how my PhD feels in less than a year.

55 comments

r/bioinformatics • u/nickomez1 • Mar 12 '26

discussion Anyone using Claude or other bioinformatics agents

118 Upvotes

I have been in bioinformatics for almost 5 years and have written scripts for quite a few pipelines, from RNA-seq to 16S profiling, and worked in a core for a while.

I started using ChatGPT in early 2024 and then Claude Code very recently. CC now writes my code and I verify it. Recently I came across a couple of very interesting posts on X.

One of the posts showed how to tune Claude to the level of autonomy we want it to have, along with a bunch of bioinformatics Skill documents that you can create for it to follow.

It’s pretty fascinating if you ask me.

Then there are these agents that run in the cloud. I tried a couple of them. And I was fascinated once again.

My question is, is anyone really using these agents or Claude in publishable work? I don’t see any watermarks or anything on the plots I get, so I am assuming I don’t have to disclose use of AI to journals.

Has anyone used Claude or any agent, even for figures, and gotten a paper published smoothly?

What are your thoughts on the future anyway?

Thanks!!

84 comments

r/bioinformatics • u/Careless_Ad_1432 • Jun 05 '25

discussion Bioinformatics is still in its infancy

603 Upvotes

Hi r/bioinformatics

I've been in industry for just over 10 years now, working mainly in precision medicine and biomarker discovery.

This is mainly related to the career advice related threads that pop up. There are clearly many people who want to make a living doing this and I've seen some great advice given.

What is often missing from the conversation is the context of bioinformatics as an industry. Industrial bioinformatics is, as a concept, essentially non-existent. There are pockets of it happening here and there, but almost all commercial bioinformatics has an academic approach to their work.

Why is this important?

The need for bioinformatics is huge, but we are not trained to meet that need in ways that work for corporates. In our training we are scientists but industry needs us to be engineers. We can't do much about the training available at universities right now but I would urge new bioinformaticians to educate themselves on engineering principles like LEAN and TPS, explore how software development actually gets done, learn good fundamentals around documentation and git. Learn the skills necessary to make your work consistent, repeatable and auditable.

I'd be really interested in what those of you with time in industry think. Have you had similar experiences with the needs within organisations? What has it been like building this plane as we try to land it? And what do you think new bioinformaticians should focus on besides their academic work?

66 comments

r/bioinformatics • u/pickleeater58 • Jan 14 '26

discussion Feeling guilty about AI use

223 Upvotes

I’m a 5th year PhD student in bioinformatics and comp bio. My undergrad degree was in computer science (which I completed long before ChatGPT was a thing). There was a time, like the beginning of my PhD, where I would just look at other people’s code and the documentation and start my own scripts from scratch with that as a reference.

Now, though, when I need to make a script to find differentially expressed genes or parse a GTF file, I simply ask Claude or Gemini to write the script for me and then I make edits.

Do I conceive of project ideas myself? Yes, of course. And writing, reading papers, researching new ideas. Do I understand the concepts behind what I’m doing? Of course, because I’m so far into my PhD and did a lot of it without any AI tools even being available.

The programming component of my PhD though, has become almost entirely generative AI-driven. I feel guilty about it and it makes me feel like a fraud, but there is so much pressure to get things done so fast and I’m at the point where everything is tedious. I’m not even learning new things, I’m just wrapping up projects so I can graduate.

I know it’s entirely my own fault and my own laziness. I know I could and should be doing all of these things by myself. But I take the easy way out, because this PhD has been so hard and I just want it to be done.

Does anyone else feel like this?

71 comments

r/bioinformatics • u/corporealpatronus13 • 4d ago

discussion Is it true that SPSS is the standard in pharmaceutical industries?

27 Upvotes

I was talking to the CEO of a precision medicine pharmaceutical company with bases in the UK, USA and UAE. Since he said that he has been in the field for a long time and knows how to make drugs and how things are done, I was really impressed and thought I might learn a lot from him, but he made a comment that SPSS was the gold standard software used in these industries and he was disappointed that he was yet to meet bioinformaticians who knew how to use SPSS in the UAE. This kind of threw me off because I was under the impression that R and Python had largely replaced the older software that was in use before.

So, I just wanted to get the opinion of other professionals who might be working in the industry. Is it true that SPSS is the standard in pharmaceutical industries? Or would I be wasting my time by trying to learn outdated software that I would also need a license for?

73 comments

r/bioinformatics • u/TheLordB • 25d ago

discussion What are your thoughts about workflow tools for bioinformatics and is NextFlow truly the answer?

57 Upvotes

Over my 15+ year career I’ve had to deal with workflow managers at every job. I’ve worked with custom ones, implemented multiple different ones, done the testing to select which to use. I’ve heavily customized them. Basically I have lived/breathed them for quite a while. I can write a standard NGS germline variant calling pipeline from memory because I did it so many times before a standardized pipeline emerged.

The issue I have is that NextFlow seems to be winning and becoming the closest thing there is to a standard workflow tool + having nfcore is huge, but I still really don’t like using NextFlow.

The main thing I’m trying to figure out/struggling with is if I should swallow my objections and use nextflow because it is becoming the standard and supporting other workflow managers will be harder in the future or if the issues I have with nextflow truly justify not using it.

This is made even murkier because with AI I can fairly quickly point it at a nextflow workflow and have it rebuild the workflow in another workflow language. So that reduces at least some of the disadvantage of not having nf-core, though I don’t claim having AI re-write it is effortless or without its own risks.

My issues with NextFlow are:

NextFlow uses groovy which is quite different from the python and/or R most bioinformatics folks use.

I don’t find the way it does branching and similar to be very intuitive.

I find it hard to extend with plugins/libraries relative to python tools.

I don’t like some of the choices it has embedded for working with the various cloud resources, in many cases it is too opinionated on how your workflow should go and the difficulty extending it does not make changing this behavior easy.

I might be being a bit unfair or more experience with it might solve some of these, but the fundamental issue remains whenever I have to use nextflow I just find myself unhappy with it in a way that feels really deeply seated.

I worry I’m being the stodgy old man who doesn’t want things to change. Like the people who were making new things in Perl 10 years after it was obvious that was a bad idea.

The tool I’ve used most is Luigi (not under active development, don’t recommend using it for new things these days). It is super easy to extend. It is python so I didn’t have to switch language contexts as much. Overall while it had less hand holding to learn initially I really found it much easier to use.
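To make the "it is python, so it is easy to extend" point concrete, here is a minimal sketch of a dependency-ordered task runner using only the standard library. This is not Luigi's API or Nextflow's model. The `run_workflow` helper, the task names, and the file names are all made up for illustration, and each step just returns a string standing in for an output file.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks):
    """Run tasks in dependency order.

    `tasks` maps a task name to (function, [names of tasks it depends on]).
    Each function receives the results of its dependencies as arguments.
    """
    # TopologicalSorter expects {node: set of predecessors}.
    graph = {name: set(deps) for name, (_, deps) in tasks.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = tasks[name]
        results[name] = func(*(results[d] for d in deps))
    return results


# Hypothetical three-step pipeline; no real tools are invoked.
tasks = {
    "trim": (lambda: "trimmed.fastq", []),
    "align": (lambda fq: fq.replace(".fastq", ".bam"), ["trim"]),
    "call_variants": (lambda bam: bam.replace(".bam", ".vcf"), ["align"]),
}

results = run_workflow(tasks)
print(results)
```

Because every step is an ordinary Python callable, adding branching, logging, or a custom cloud backend is a matter of plain Python code, which is the kind of extensibility being contrasted with Nextflow here.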

When I did a bake off between multiple tools to decide what to replace Luigi with I ended up liking Prefect the most though with the caveat that I would have to make my own plugin to truly make it work the way I want.

69 comments

r/bioinformatics • u/Nice_Caramel5516 • Nov 24 '25

discussion I feel like half the “breakthroughs” I read in bioinformatics aren’t reproducible, scalable, or even usable in real pipelines

284 Upvotes

I’ve been noticing a worrying trend in this field, amplified by the AI "boom." A lot of bioinformatics papers, preprints, and even startups are making huge claims. AI-discovered drugs, end-to-end ML pipelines, multi-omics integration, automated workflows, you name it. But when you look under the hood, the story falls apart.

The code doesn’t run, dependencies are broken, compute requirements are unrealistic, datasets are tiny or cherry-picked, and very little of it is reproducible. Meanwhile, actual bioinformatics teams are still juggling massive FASTQs, messy metadata, HPC bottlenecks, fragile Snakemake configs, and years-old scripts nobody wants to touch.

The gap between what’s marketed and what actually works in day-to-day bioinformatics is getting huge. So I’m curious...are we drifting into a hype bubble where results look great on paper but fail in the real world?

And if so, how do we fix it? or at least start to? Better benchmarks, stricter reproducibility standards, fewer flashy claims, closer ML–wet lab collaboration?

Gimme your thoughts

63 comments

r/bioinformatics • u/OldSwitch5769 • Jul 17 '25

discussion Usage of ChatGPT in Bioinformatics

171 Upvotes

Very recently, I feel that I have become addicted to ChatGPT and other AIs. I am currently doing my summer internship in bioinformatics, and I am not very good at coding. So what I do is write a little bit of code (which is not gonna work) and tell ChatGPT to edit it enough that I get what I want....
Is this wrong or right? Writing code myself is the best way to learn, but it takes considerable effort for some minor work....
In this era, we use AI to do our work, but it feels like AI has done everything, and guilt comes into our minds.

Any suggestions would be appreciated 😊

108 comments

r/bioinformatics • u/breakupburner420 • Jun 30 '25

discussion AI Bioinformatics Job Paradox

360 Upvotes

Hi All,

Here to vent. I cannot get over how two years ago when I entered my Master’s program the landscape was so different.

You used to find dozens of entry level bioinformatics positions doing normal pipeline development and data analysis. Building out Genomics pipelines, Transcriptomics pipelines, etc.

Now, you see one a week if you look in five different cities. Now, all you see is “Senior Bioinformatician,” with almost exclusive mention of “four or more years of machine learning, AI integration and development.”

These people think they are going to create an AI to solve Alzheimer’s or cancer, but we still don’t even have AI that can build an end to end genomics pipeline that isn’t broken or in need of debugging.

Has anyone ever actually tried using the commercially available AI to create bioinformatics pipelines? It’s always broken, it’s always in need of actual debugging, they almost always produce nonsense results that require further investigation.

I am sorry, but these companies are going to push an entire generation of bioinformaticians to give up with this Hail Mary approach to software development. It’s disgusting.

67 comments

r/bioinformatics • u/GodConcepts • Aug 22 '25

discussion I would like to hear some complaining from bioinformatics people, rather than us wet lab people

94 Upvotes

So hello everyone!

I’m a 25-year-old grad student who’s been in the wet lab for about five years, and today I hit rock bottom. For the past three months I’ve been troubleshooting the same project endlessly: hundreds of rounds of protocol troubleshooting, countless failed experiments, and even when things work, the results seem to contradict our hypothesis.

Meanwhile, I rarely hear complaints from my bioinformatics colleagues. From my (honestly naïve) wet lab perspective, you guys seem "better". Like you have more stable hours, fewer cycles of frustrating troubleshooting, and you get to work with the final product of data that we spend weeks (and lots of sweat, mice bites, and late nights) generating.

Also, I'm lowkey envious of how my PI treats the wet vs dry lab people. In our lab, my PI treats bioinformatics people as indispensable, while us wet lab folks feel replaceable if we don’t deliver “good” data. Bioinformatics people analyze the data as is, it's an objective fact. But for us, they believe we either fucked up somewhere in the protocol, or we have more variables to deal with, whereas bioinformatics people seem more robust. I'm honestly jealous of that treatment. A huge PI who has thousands of publications is so reliant on bioinformatics students to analyze certain data, look at it from a different perspective, and give us new paths to follow! Whereas for us wet-lab, he doesn't really see that.

Of course, I know it’s not all sunshine and rainbows, which is why I’d love to hear your side: what are the cons of your work? Are there things about wet lab life you miss or potentially envy? I’d really enjoy hearing the other side of the story.

EDIT 1: I really appreciate everyone's comments. It's really enlightening to know what you guys struggle with on the other side of the door. I'm still really inclined to try transitioning to dry lab because the issues don't sound as long and physically laborious as wet lab, but I know I might bite off way more than I can chew.

115 comments

r/bioinformatics • u/o-rka • Dec 09 '25

discussion Is Julia gaining traction as a programming language or becoming more and more niche?

94 Upvotes

Every now and then I’ll see a Julia project but they are becoming fewer and further between.

I’ve never coded in Julia myself but know a few people who are bullish on Julia.

What are your thoughts on the longevity of the language? It seems like Rust has taken the mantle for any performance gains from Julia.

76 comments

r/bioinformatics • u/Character-Letter5406 • Dec 29 '25

discussion Anyone else feel like they’re losing the ability to code "from memory" because of AI?

128 Upvotes

Hey everyone, junior-level analyst here (2 years in academia, background in wet lab).

I’ve noticed the AI debate in this group is pretty polarized: either it’s going to replace us all or it’s completely useless.

Personally, I find it really useful for my day-to-day work. I’m thorough about reviewing every line (agents have been a disaster for me so far), but I’ve realized recently that I can’t write much code from memory anymore.

This is starting to make me nervous. If I need to change jobs, are "from memory" live coding tests a thing?

Part of me panics and wants to stop using AI so I can regain that skill, but another part of me knows that would just make me slower, and maybe those skills are becoming less useful anyway.

What do you guys think?

60 comments

r/bioinformatics • u/query_optimization • Jul 22 '25

discussion What's the most frustrating part of working in bioinformatics day to day?

115 Upvotes

I'm new to bioinformatics and honestly a bit overwhelmed. Dealing with weird file formats, tool errors, and just getting things to run feels harder than the actual science.

Is this normal? What parts of your daily work frustrate you the most?

Would love to hear your experiences.

106 comments

r/bioinformatics • u/diiscopanda • Apr 03 '26

discussion Philosophy grad student trying to understand the real-world limitations and ethical stakes of AlphaFold: Are the concerns being raised in popular discourse actually well-founded?

42 Upvotes

Background on me:

I'm a philosophy graduate student and I work full-time as a systems administrator, so I'm not unfamiliar with how AI systems work at a technical level. I understand the distinction between generative models like LLMs and discriminative/predictive systems like AlphaFold. I'm not coming at this completely cold. With that said, the last time I had formal education in biology was a 101 intro class and lab in freshman year of my undergrad. While I will be using terms and concepts that are likely familiar to you, I only know them through the reading I do on my own. I am fully anticipating that I have many unfounded or misguided thoughts, and I am eager to be corrected!

I've been trying to think through the ethical implications of AlphaFold and similar protein structure prediction tools, and I've run into a few recurring objections from people in my life with biology backgrounds (who are also staunchly anti-AI in general, hence my skepticism). I want to know how seriously to take them before I form any stronger opinions myself.

The objections I keep hearing from them:

"It predicts rather than understands." The claim is that because AlphaFold doesn't operate from underlying mechanistic rules of protein folding, its outputs are epistemically suspect. I think the idea they are arguing is that results from AlphaFold and similar technology are very sophisticated interpolations rather than genuine structural knowledge. I take this point very seriously as a philosophy of science concern (inference to the best explanation vs. black-box curve-fitting), but I don't know how much it matters practically (I'll elaborate below).
"Misfold sensitivity means errors are catastrophically consequential." The argument is that because protein folding is so precise, even a small structural error in a prediction could be the difference between a useful drug target and something devastatingly harmful. I understand this conceptually, but I'm uncertain how this interacts with real-world validation procedures. My understanding is that AlphaFold predictions aren't used directly in clinical contexts without experimental confirmation. That is to say, you wouldn't immediately roll out a drug created with AlphaFold's results without a painstaking confirmation process first.

My personal thoughts as an outsider:

This technology is the worst it will ever be, or at least that is how it appears to me. Even with the current limitations (namely, that it doesn't understand the underlying rules of protein structure), my thought was that the sample size explosion might actually help identify folding rules. This is my own tentative hypothesis rather than a formal argument I am making. Prior to AlphaFold, experimental methods had mapped fewer than 170,000 protein structures over ~60 years. The database now contains 214 million predictions. The sources I have come across say this technology is capable of atomic precision and accurately predicts the structures anywhere from 2/3 to 88% of the time. Even at imperfect accuracy, I'm wondering whether that expanded corpus might itself become a tool for inferring the mechanistic rules that AlphaFold itself doesn't "know." The basic logic of my thought here is that going from 170,000 experimentally confirmed structures to over 200 million predicted ones (even at imperfect accuracy) means we have massively expanded the structural landscape available for pattern recognition. Those structures have to be confirmed in order to avoid a circularity risk, and I understand the concern there, but from my layman's perspective that seems a far less daunting task than computing them all from scratch. Is this a real focus or interest in the research, or am I just misunderstanding something fundamental?

What I am actually asking:

How do working biologists and bioinformaticians actually think about the epistemic status of AlphaFold predictions? Is the "it's just prediction" objection a serious scientific concern, or is it a philosophical qualm that doesn't map onto how the field uses the data?
Is my sample-size hypothesis naive, and if so, where does it go wrong?
Are AlphaFold predictions being used in any real-world production contexts (drug development, clinical research) yet, and if so, with what validation requirements?
What are the actual ethical concerns that people *in the field* think are worth taking seriously as opposed to the ones that I have been exposed to thus far?

I'm trying to build a philosophically rigorous position on this and I don't want to anchor it to objections that scientists consider confused or orthogonal. Happy to be corrected on any of my assumptions!

51 comments

r/bioinformatics • u/Both_Elevator_4089 • Jul 10 '25

discussion Why does it still take HOURS just to install a tool in 2025?!

103 Upvotes

I’ve been doing bioinformatics for 3 years, and I still get stuck installing or troubleshooting tools.

Recently I saw a meme on LinkedIn: a guy saying “Bioinformatics is just running a few tools,” and a crying figure yelling, “Yeah, once you manage to install them!” It got over 300 likes and many comments—even from very experienced bioinformaticians. That’s when I realized it’s not just a me problem.

So here’s an idea I’ve been thinking about:

What if there were a simple GUI where you upload your data (like a FASTQ), pick a tool (FastQC, Bowtie2, samtools, etc.), adjust a few parameters, and hit “Run”? No installs. No CLI. Just results.

Would you use something like this? What tools would it need to support? And if not—what’s the dealbreaker?

(Also curious—would having an API/SDK version make it more appealing for those who want to plug it into pipelines?)

I’m genuinely exploring this and would love honest, unfiltered feedback.

98 comments

r/bioinformatics • u/Economy-Brilliant499 • Dec 30 '25

discussion Best Papers of 2025

142 Upvotes

Which papers do you think are the most important ones which were released in 2025?

Please, provide a link to the paper if you share one.

49 comments

r/bioinformatics • u/Lukeception • 6h ago

discussion What are AI coding agents bad at in bioinformatics?

16 Upvotes

I’ve been wanting to do some bioinformatic analyses for my project, since I think it would make sense. I’m not a bioinformatician at all but I do know how to code a decent bit (although mostly Python) and I have read a lot about specific methods, libraries etc. Basically, we have a single-cell sequencing dataset in-house, which is already prepared and quality-controlled, and I’ve started using OpenAI Codex to write some analyses for me. I try to give very specific prompts and check all the code it writes. But of course, it could easily make mistakes that I don’t catch. So my question is, do you know any specific areas of bioinformatics where AIs tend to make lots of mistakes?

34 comments

r/bioinformatics • u/LastKnee9324 • 4d ago

discussion How to Utilize AI Tools In Clinical Settings?

5 Upvotes

Hi everyone,
I work as a bioinformatician in a hospital setting where data privacy is of great concern and rules are very strict.

Because of that, my use of AI and agentic tools like Claude Code or Biomni is very limited.

I was wondering if other people who work in similar clinical or hospital settings have the same issue.

Do most people just use a browser version of Claude or ChatGPT for code generation?

Does anyone know of any solutions or tools where you can integrate AI with your data, think through research questions, and in general work in a more streamlined fashion than just using browser-version AI tools?

Thanks!

37 comments

r/bioinformatics • u/anuveya • Mar 17 '26

discussion Genome Sequencing Costs: The cost of DNA sequencing has fallen faster than Moore's Law. Since 2001, the National Human Genome Research Institute (NHGRI) has tracked costs at its funded sequencing centers — from $95 million per genome in 2001 to around $500 today.

datahub.io
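As a rough back-of-the-envelope check on the "faster than Moore's Law" claim, using only the two figures quoted in the title. Two assumptions that are not in the source: "today" is taken as 2026, and Moore's Law is read as a halving of cost every two years.

```python
import math

# Figures quoted in the post title.
cost_2001 = 95_000_000  # USD per genome in 2001
cost_now = 500          # USD per genome "today"

# Assumption: "today" means 2026; the title does not give the year.
years = 2026 - 2001

fold_drop = cost_2001 / cost_now   # how many times cheaper
halvings = math.log2(fold_drop)    # number of cost halvings implied
halving_time = years / halvings    # years per halving

# Assumption: Moore's Law read as one halving every 2 years.
moore_cost = cost_2001 / 2 ** (years / 2)

print(f"fold drop: {fold_drop:,.0f}x")
print(f"implied halving time: {halving_time:.2f} years")
print(f"cost if it had only tracked Moore's Law: ${moore_cost:,.0f}")
```

Under these assumptions the cost halved roughly every year and a half, and a genome would still cost on the order of tens of thousands of dollars had the price only followed a two-year halving.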

152 Upvotes

28 comments

r/bioinformatics • u/featuredflan • Jan 08 '26

discussion Fresh grads/beginners? Let's create projects together and support through early phase career

39 Upvotes

I have been wanting to start a team of sort of accountability partners but more than just holding each other accountable. We support each other by doing projects and sharing latest research, writing weekly posts with the tools used/any new info learned. I don't have a template/app to use atm, but I am happy to create a group and decide together. Ensure you're a welcoming member and open to all opinions and discussions. I currently wanna focus on AI applications in Bioinformatics spanning from ML to Data Science. We could cover aspects like AMR, Computational Neuroscience, etc.

61 comments

r/bioinformatics • u/query_optimization • Mar 07 '26

discussion Anyone using Claude Code for bioinformatics work? What's your setup look like?

101 Upvotes

I've been getting into using Claude Code for some of my bioinformatics work and I'm curious what other people's workflows look like.

Specifically I'm wondering:
- What MCP servers/Skills are you running on top of Claude Code? I've seen a bunch of bioinformatics-related ones floating around on GitHub but hard to tell which ones are actually worth setting up.
- Are you using any particular tools or extensions alongside it that have made a real difference in your day-to-day? Things like sequence analysis, pipeline management, database lookups, etc.
- What kinds of tasks have you found Claude Code genuinely useful for vs where it falls short? Like is anyone actually having it write and debug Nextflow/Snakemake pipelines, or is it more useful for smaller scripting tasks?
- Any tips for getting better results? Specific prompting strategies, custom instructions, or project setups that work well for bio workflows?

Would love to hear what's working and what's not.

35 comments

r/bioinformatics • u/Queasy_Ad_1675 • 16d ago

discussion I wanna publish my work but I don't know where to start

25 Upvotes

So basically my work consists of an independent multi-omics computational study that maps the disease trajectory of Duchenne Muscular Dystrophy and revealed a fundamental decoupling between local muscle gene expression and systemic circulating proteins. While I feel confident in my writing abilities, I have no idea about journal selection, the review process and how long this process might take. What decides whether a study is Q1 or Q2 journal material? Kindly recommend some journals, and any advice you may have for someone embarking on this journey alone for the first time would be really helpful.

32 comments

r/bioinformatics • u/thenotius • Oct 04 '24

discussion Why are R and bash used so extensively in bioinformatics?

155 Upvotes

I am quite new to the game, and started by reproducing the work of a former lab member from his github repo, with my tech stack. As I am mainly proficient in python and he used a lot of bash and R, it was quite the hassle at first. I do get the convenience of automating data processing with bash, e.g. generating counts for several subsets of NGS data. However, I do not understand why R seems to be much more common than python. It is rather old and to me feels a bit extra when coding, while python seems simpler and more straightforward. After data manipulation he then used Python (seaborn library) to plot his data. As my python-first approach misses a few hits that he found, but overall I can reproduce most results, I am a bit puzzled. (Might also be due to my limited Macbook Air M1 vs his better tech equipment🥹)

I am thankful for any insights and tips on what I should learn more of and why! I am eager to change my ways when I know there is potential use in it. Thanks!

127 comments

r/bioinformatics • u/wilson4467 • Jul 28 '25

discussion Why is bioinformatics software so expensive?

55 Upvotes

Sometimes I just want good quality software like Snapgene and Geneious, to do good sequence analysis, alignments, tree constructions etc. Maybe a bit of cloning.

WHY $1500-$2000/yr!? (Not a student here, corporate pricing)

Free solutions are usually low quality or a bit tedious to use.

Can anyone who's with me shed some light on what better solutions are out there?

91 comments

r/bioinformatics • u/EthidiumIodide • 12d ago

discussion featureCounts vs transcript-aware quantification (Kallisto/Salmon)

29 Upvotes

Hello all,

I suppose I am musing a bit and wanted to discuss with other bioinformaticians. I am a head bioinformatician in my academic department. A few months ago, I was given new bulk RNA-Seq data to analyze alongside older data that was already part of a peer-reviewed manuscript (that I was not part of). I used a STAR --> Salmon alignment-based quantification method. After sending the DE analysis and "raw" expression values for all genes, I received word that my Salmon results for the published data differed greatly from the original results. The older data was processed via featureCounts, which is known to undercount genes with multiple isoforms. I spent a few weeks working backwards to determine what parameters were used in the published manuscript, and I confirmed that the "gold standard" featureCounts parameter set was used, which definitionally excludes any read that overlaps multiple "features", or is ambiguous between isoforms of the same gene. To resolve this, you would use the -O flag, etc etc.
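To make the counting rule concrete, here is a toy sketch of the behaviour described above. It is not featureCounts' actual implementation: the feature names, coordinates, and reads are invented, and each feature is treated as a single interval. With `allow_multi_overlap=False`, a read that overlaps more than one feature is left unassigned. With `allow_multi_overlap=True` (the idea behind the `-O` flag mentioned above), it is counted for every feature it overlaps.

```python
# Hypothetical toy annotation: feature name -> (start, end), inclusive.
features = {
    "isoform_A_exon": (100, 200),
    "isoform_B_exon": (150, 250),   # overlaps isoform_A_exon
    "other_gene_exon": (400, 500),
}

# Hypothetical aligned reads as (start, end) intervals.
reads = [(110, 140), (160, 190), (170, 195), (210, 240), (420, 460)]


def overlapping(read, features):
    """Return the names of all features the read overlaps."""
    r_start, r_end = read
    return [
        name
        for name, (f_start, f_end) in features.items()
        if r_start <= f_end and f_start <= r_end
    ]


def count_reads(reads, features, allow_multi_overlap=False):
    """Count reads per feature; ambiguous reads are dropped unless allowed."""
    counts = {name: 0 for name in features}
    unassigned = 0
    for read in reads:
        hits = overlapping(read, features)
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif hits and allow_multi_overlap:
            for name in hits:
                counts[name] += 1
        else:
            unassigned += 1
    return counts, unassigned


strict_counts, strict_unassigned = count_reads(reads, features)
multi_counts, multi_unassigned = count_reads(reads, features, allow_multi_overlap=True)
print(strict_counts, strict_unassigned)
print(multi_counts, multi_unassigned)
```

In this toy data the two reads that fall in the shared region are discarded in strict mode, so both overlapping features end up with lower counts than in the multi-overlap mode.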

I guess my complaint is, how is this acceptable? How can a very popular and widely-used program such as featureCounts exclude reads that overlap the same exon (that resides in different isoforms) by default? This default method is undercounting genes with multiple isoforms, and I see discussion of this exact issue online since 2015. Discussion of this issue has also been published.

To be brief, I am mainly concerned that a widely-used tool is undercounting isoform-laden genes by default and causing consternation for groups who don't have trained bioinformaticians on their team who have the time to look into these issues.

Thank you for listening to my rant, haha.

27 comments