Merging rows in R

I have pairwise ortholog predictions between multiple species (A-B, B-C, A-C) and now I want to join all the data together to see which genes are shared among all the species, which are shared among subsets and which are unique to a single species. I used R's merge() function (merge.data.frame) to merge the data frames together and for the most part things look as expected:


81    sr10108      BN887_04349    SPSC_02689
82    sr10109      BN887_04348    NA
83    sr10112.2    BN887_04345    SPSC_02690

In some cases, though, the rows are not combined the way I would like and I get a result like this:

941    BN887_01039    SPSC_05463    sr10904
942    BN887_01040    SPSC_05465    NA
943    BN887_01040    NA            sr10908

BN887_01040 has been predicted to be orthologous to both SPSC_05465 and sr10908, but those two weren't predicted to be orthologous to each other. I think it should be fine to assume they are: if A = B and A = C, then it must follow that B = C. However, I do not know how to merge rows like that so that there is only one entry per gene name.

How can I join rows like that so that they display as "BN887_01040    SPSC_05465   sr10908"? I have tried variations of the merge command itself but none seem to have solved this problem.
asked Feb 17, 2015 in Bioinformatics by jason.bosch (220 points)

1 Answer

In the end I just used a different programme which identifies orthologs in multiple species at the same time.
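
For anyone who still wants to collapse the rows by hand, here is a rough sketch of the "if A = B and A = C then B = C" grouping from the question. It is written in Python rather than R purely to show the logic, and it is not what I ended up using. The function name merge_ortholog_rows, the use of None for NA and the ";" separator for several genes from the same species are all my own assumptions. The idea is to treat every (column, gene) pair as a node, link all genes that appear in the same row, and then write out one row per connected group.

```python
def merge_ortholog_rows(rows):
    """Collapse rows that share a gene into one row per ortholog group.

    rows: list of equal-length tuples, one cell per species, None for NA.
    Hypothetical helper for illustration; assumes orthology is transitive.
    """
    parent = {}  # union-find forest over (column, gene) nodes

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    # Link every gene in a row to the first gene of that row.
    for row in rows:
        nodes = [(col, gene) for col, gene in enumerate(row) if gene is not None]
        for node in nodes:
            find(node)  # register genes that occur alone in a row
        for other in nodes[1:]:
            parent[find(other)] = find(nodes[0])

    # Collect genes per group, keeping the order of first appearance.
    n_cols = len(rows[0]) if rows else 0
    groups = {}
    for row in rows:
        for col, gene in enumerate(row):
            if gene is None:
                continue
            cells = groups.setdefault(find((col, gene)),
                                      [set() for _ in range(n_cols)])
            cells[col].add(gene)

    # Several genes from one species in a group are joined with ";".
    return [tuple(";".join(sorted(c)) if c else None for c in cells)
            for cells in groups.values()]


rows = [
    ("BN887_01039", "SPSC_05463", "sr10904"),
    ("BN887_01040", "SPSC_05465", None),
    ("BN887_01040", None, "sr10908"),
]
for merged_row in merge_ortholog_rows(rows):
    print("\t".join(cell if cell is not None else "NA" for cell in merged_row))
```

Keep in mind that chaining predictions like this can pull paralogs into one large group, so any row that ends up with several genes in the same column deserves a closer look.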

answered Sep 1, 2016 by jason.bosch (220 points)